Regexp with a list of keywords

Stephan Kolb on 10 Sep 2019
Edited: Stephen23 on 10 Sep 2019
Hi everybody,
I'm aware regexp is a powerful tool, but I'm a newbie to this and I think my problem is an advanced one:
There is a list of keywords
keywords = {'alpha','basis','colic','druid','even'};
and I want regexp to find, in a string, all values that follow these keywords (each keyword is followed by =).
For example:
str = 'basis=10,alpha=today,druid=none,even=odd,even=even';
should give, in keyword order:
'today','10',[],'none',{'odd','even'}
Can you help me?

Accepted Answer

Stephen23 on 10 Sep 2019
Edited: Stephen23 on 10 Sep 2019
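Using a simple lookbehind assertion (one kind of lookaround), which requires the text keyword= immediately before the match without including it in the result: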
>> keywords = {'alpha','basis','colic','druid','even'};
>> str = 'basis=10,alpha=today,druid=none,even=odd,even=even';
>> fun = @(k)regexp(str,sprintf('(?<=%s=)\\w+',k),'match');
>> out = cellfun(fun,keywords,'uni',0);
>> out{:}
ans =
'today'
ans =
'10'
ans =
{}
ans =
'none'
ans =
'odd' 'even'
Note that each cell of out is itself a cell array, with varying sizes. If you want to unnest the scalar cells, as you indicate in your question, then try this:
>> idx = cellfun(@isscalar,out);
>> out(idx) = [out{idx}]
out =
'today' '10' {} 'none' {1x2 cell}
>> out{:}
ans =
today
ans =
10
ans =
{}
ans =
none
ans =
'odd' 'even'
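One caveat with the lookbehind pattern above: it also fires when a keyword is only the tail of a longer key (for example, a key seven= ends in even=), and it treats any regexp metacharacters in a keyword as pattern syntax. The following is a minimal sketch of one way to guard against both, not part of the original answer. It assumes pairs are separated by commas, swaps the lookbehind for a captured token, and uses a made-up extra key 'seven' purely for illustration.

```matlab
keywords = {'alpha','basis','colic','druid','even'};
str = 'basis=10,seven=7,even=odd,even=even';   % 'seven' is a hypothetical extra key

% One pattern per keyword: the key must start at the beginning of the text
% or directly after a comma, and its characters are matched literally.
makePat = @(k) ['(?:^|,)', regexptranslate('escape', k), '=(\w+)'];

% 'tokens' returns one 1x1 cell per match; unwrap each into a char vector.
getVals = @(k) cellfun(@(t) t{1}, regexp(str, makePat(k), 'tokens'), ...
    'UniformOutput', false);

out = cellfun(getVals, keywords, 'UniformOutput', false);
```

With this input, the cell for 'even' should hold only 'odd' and 'even', whereas the plain lookbehind would also pick up the 7 from seven=7.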
  2 Comments
Stephen23 on 10 Sep 2019
Stephan Kolb's "Answer" moved here:
Hi Stephen,
thank you very much for your quick and smart answer!!!
Do you think, we can map your solution to string arrays, e.g.
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];
Thank you in advance,
Stephan
Stephen23 on 10 Sep 2019
Edited: Stephen23 on 10 Sep 2019
"Do you think, we can map your solution to string arrays"
Sure: use a loop or arrayfun, concatenate the data into one character vector or scalar string, or work with the cell array outputs that regexp returns for non-scalar input. Whichever works for you.
But if your data really are separate elements (and not in one character vector as you showed in your question), then I would probably just split them at each = character, compare the first parts using strcmp or the like, and then use accumarray or similar to group the values together.
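The split-and-group idea could look roughly like the sketch below. It is only one possible reading of the suggestion: it assumes every element contains exactly one =, uses extractBefore/extractAfter for the split, and uses ismember in place of strcmp for the comparison. The helper names key, val, vf and subs are made up for this example.

```matlab
keywords = ["alpha","basis","colic","druid","even"];
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];

% Split each element at the '=' into a key part and a value part.
key = extractBefore(str, "=");
val = extractAfter(str, "=");

% idx(i) is the position of key(i) in keywords (0 if it is not a keyword).
[found, idx] = ismember(key, keywords);

vf   = val(found);        % values that belong to a known keyword
subs = idx(found).';      % column vector of keyword positions

% Group the values by keyword position. accumarray passes the row numbers
% of each group to the function; sorting them keeps the original order.
out = accumarray(subs, (1:numel(vf)).', [numel(keywords) 1], ...
    @(k) {vf(sort(k))}).';
```

Keywords that never occur (here "colic") are left as empty cells, and a repeated key such as "even" ends up as a 1x2 string array.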


More Answers (1)

Walter Roberson on 10 Sep 2019
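str = 'basis=10,alpha=today,druid=none,even=odd,even=even';   % as in the question
keywords = {'alpha','basis','colic','druid','even'};
% Parse every name=value pair into a struct array with fields name and value.
kv = regexp(str, '(?<name>\w+)=(?<value>\w+)', 'names');
values = cell(1, length(keywords));
% ismember returns the lowest matching index, so a repeated key such as
% 'even' contributes only its first value ('odd').
[found, idx] = ismember(keywords, {kv.name});
values(found) = {kv(idx(found)).value};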
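Because ismember reports only the first position of each keyword, the code above keeps a single value per key. If every value of a repeated key is wanted, as in the question's expected output, a possible variation is sketched below. It keeps the same named-token parsing but replaces the ismember lookup with a strcmp mask per keyword; the helper variables names and vals are introduced here for illustration.

```matlab
keywords = {'alpha','basis','colic','druid','even'};
str = 'basis=10,alpha=today,druid=none,even=odd,even=even';

% Parse all name=value pairs into a 1xN struct array.
kv = regexp(str, '(?<name>\w+)=(?<value>\w+)', 'names');
names = {kv.name};
vals  = {kv.value};

% For each keyword, keep every value whose name matches it exactly.
values = cellfun(@(k) vals(strcmp(names, k)), keywords, ...
    'UniformOutput', false);
```

Each element of values is then a cell array: empty for 'colic', one element for keys that occur once, and {'odd','even'} for 'even'.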
  1 Comment
Stephan Kolb on 10 Sep 2019
Consequently, when using a string array for str, it's better to use a string array for the list of keywords, too.
So we have:
keywords = ["alpha","basis","colic","druid","even"];
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];
Can we adapt your solution?
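One way this might be adapted, offered as a sketch rather than as the answerer's own reply: join the string array into a single character vector and convert the keywords with cellstr, so that the posted code runs unchanged. The variable txt is introduced here for illustration, and the comma delimiter is an assumption (any character that \w does not match would do).

```matlab
keywords = ["alpha","basis","colic","druid","even"];
str = ["basis=10","alpha=today","druid=none","even=odd","even=even"];

% Join the pieces into one character vector so regexp sees a single text.
txt = char(join(str, ","));
kv = regexp(txt, '(?<name>\w+)=(?<value>\w+)', 'names');

values = cell(1, numel(keywords));
% cellstr converts the string array so it can be compared with {kv.name}.
% As before, ismember keeps only the first value of a repeated key.
[found, idx] = ismember(cellstr(keywords), {kv.name});
values(found) = {kv(idx(found)).value};
```

The results come back as character vectors; wrapping an individual result in string(...) converts it if string output is preferred.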
