How to use metacharacters in combination with cell arrays to build up a pattern for regexp?

I just started coding and I can't manage regexp properly. My code is probably not that efficient, but here is what I need:
I wrote code like this, where X and Y are both cell arrays (every cell consists of string arrays).
Y{1} is defined.
for i=1:length(X)
for k=1:length(Y{i})
Z{i}(k+1)=max(find(~cellfun(@isempty,regexp(X{i}, Y{i}{k}))));
...
end
end
This works fine, because I expect Z to be a cell array of indices: the index of the last (max) occurrence of the string contained in Y{i}{k}.
The problem is that this string (let's say for example "22") is contained also in other strings within Y{i}{k} (let's say "3422"), but my goal is to get the max index of occurrence of the string "22" with no other digits or characters before or after it.
What I am asking for now is a method, possibly using metacharacters, to get the max index of occurrence of just the string Y{i}{k}, with no digits or characters before or after it.
I thought of using something like '(?<!\d)(\d)(?!\d)', which matches single-digit numbers (digits that do not precede or follow other digits),
but it seems hard to combine metacharacters and Y{i}{k} in the same pattern. I have a snapshot below of what I attempted with no success; it might help.
Any idea of how to solve this? Thanks in advance!!

Accepted Answer

Stephen23 on 13 Jul 2020
Edited: Stephen23 on 13 Jul 2020
"my goal is to get the max index of occurrence of the string "22" with no other digits or characters before or after it."
I assume for now that by "no characters" you actually mean no non-whitespace characters, or perhaps no letter characters. Clarification on this would be helpful.
Possibly a good approach would be to use anchors, such as the word anchors \< and \>,
and you can trivially join these with the Y string using either concatenation or sprintf, e.g.:
rgx = sprintf('\\<%s\\>',Y{i}{k});
regexp(X{i},rgx)
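As a minimal sketch of how the anchored pattern could be used to get the last matching index: the sample data and the variable names key, hits and idx are made up for illustration, and regexptranslate is added only as a precaution in case the search string ever contains regexp metacharacters.

```matlab
% Hypothetical sample data, not taken from the original post.
X   = {'22', 'abc 3422', 'x 22 y', '2245'};
key = '22';

% Escape any metacharacters in the key, then wrap it in word anchors.
rgx = ['\<', regexptranslate('escape', key), '\>'];

% For a cell array input, regexp returns one result per cell;
% an empty result means "no match in that cell".
hits = ~cellfun(@isempty, regexp(X, rgx, 'once'));

% Index of the last cell that contains the key as a whole word.
idx = find(hits, 1, 'last');
```

Here find(hits,1,'last') does the same job as max(find(hits)) but stops as soon as the last true element is found.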
Most likely you can remove one of the loops by constructing one regular expression for all of the k strings, e.g.:
rgx = '\<(22|23|99|123)\>';
You can generate that quite easily using sprintf or strjoin or similar.
"No Characters": your description that you want "no other digits or characters before or after it" literally means that you would only match the string "22", because whitespace, punctuation, and non-printing characters are also characters, so by your description they must be excluded... leaving literally no characters.
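One possible way to generate such an alternation with strjoin is sketched here. The cell array keys stands in for one Y{i} and, like X, is hypothetical data; note that 'once' returns only the first whole-word match in each cell, so this is a starting point rather than a drop-in replacement for the loop.

```matlab
% Hypothetical list of search strings, standing in for one Y{i}.
keys = {'22', '23', '99', '123'};

% Escape each key (a precaution), then join them with '|' inside a group.
escaped = cellfun(@(s) regexptranslate('escape', s), keys, 'UniformOutput', false);
rgx = ['\<(', strjoin(escaped, '|'), ')\>'];

% Hypothetical text to search.
X = {'22', 'a 3422', '99 23', '1234'};

% First whole-word match in each cell (empty char where nothing matches).
firstMatch = regexp(X, rgx, 'match', 'once');
```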
I assumed above that you actually mean to match the substring
"... 22 ..."
%   ^  ^ whitespace, not letter, not digit
Are punctuation characters allowed or not? If you really do mean "no characters" then using regular expressions is an overly complex way to match literal strings.
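If punctuation next to the number should also prevent a match, one option is to replace the word anchors with lookarounds that require whitespace (or the start/end of the text) on both sides. This is only a sketch under that assumption; the data and variable names are hypothetical.

```matlab
% Hypothetical search string and sample data.
key = '22';
X   = {'a 22 b', 'a-22', '22,', '22'};

% Negative lookbehind/lookahead: no non-whitespace character directly
% before or after the key.
rgxSpace = ['(?<!\S)', regexptranslate('escape', key), '(?!\S)'];
hitsSpace = ~cellfun(@isempty, regexp(X, rgxSpace, 'once'));

% For comparison: word anchors only exclude neighbouring letters,
% digits and underscores, so punctuation next to the key is accepted.
rgxWord = ['\<', regexptranslate('escape', key), '\>'];
hitsWord = ~cellfun(@isempty, regexp(X, rgxWord, 'once'));
```

With this data the lookaround version rejects 'a-22' and '22,' while the word-anchor version accepts them, which is exactly the difference the punctuation question is about.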
3 Comments
Stephen23 on 13 Jul 2020
Edited: Stephen23 on 13 Jul 2020
A simple string comparison will do that task simply and efficiently:
>> C = {'22';'asa';'22';'33';'2245';'stan'};
>> X = strcmp(C,'22')
X =
1
0
1
0
0
0
>> max(find(X))
ans = 3
Regular expressions are just a red herring: they will be much less efficient than a simple string comparison. But if you really want to use a regular expression:
X = ~cellfun(@isempty,regexp(C,'^22$'))
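A rough sketch of how the strcmp approach might slot into the nested loop from the question. The contents of X and Y are invented, the result is stored at Z{i}(k) rather than Z{i}(k+1) because the rest of the original loop is not shown, and NaN is an arbitrary choice for "not found".

```matlab
% Hypothetical data shaped like the question's X and Y.
X = {{'22', 'asa', '22', '33', '2245', 'stan'}};
Y = {{'22', '33', '99'}};

Z = cell(size(X));
for i = 1:numel(X)
    Z{i} = nan(1, numel(Y{i}));          % NaN marks "no exact match"
    for k = 1:numel(Y{i})
        % Exact comparison: '2245' does not count as a match for '22'.
        idx = find(strcmp(X{i}, Y{i}{k}), 1, 'last');
        if ~isempty(idx)
            Z{i}(k) = idx;
        end
    end
end
```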
Davide Festa on 14 Jul 2020
Thanks a lot, it works perfectly by using strcmp.
Now I see that by using regexp I made the problem more complicated than it was!
