
Regex for string match

ML_Analyst on 24 Sep 2023
Commented: ML_Analyst on 25 Sep 2023
I have a large 1×50000 string array like the one below:
Stock_field1_img
Sys_tim_valt98.qaf.rat.app.gui
Enable1.HSB_setblcondition.Enable_logic.ui
P_k12.delay.init_func_delay_update.Sys
#fat_11ks.ergaa.ths.dell
$thispt.dynmem11.ide.gra
.....
.....
I am looking for a regex that can search this array based on user input. For example:
if the user gives st*, it should return all strings starting with "st";
if the user gives *st, it should return all strings ending with "st";
if the user gives *st*, it should return all strings that contain "st" anywhere;
the user can also give *st*app.*sys*, which should return all strings that contain "st", followed later by "app.", followed later by "sys".
I tried several combinations, including the one below:
expression = '\w* + signal + \w*';
a = regexp(str_array, ,'match','ignorecase');
but it doesn't work as intended. Could someone help with this?

Accepted Answer

Voss on 24 Sep 2023
I think it may be tricky to get this to work for every expression the user might enter, because every character that is special in regexp has to be handled in the user-input expression. For example, you want * to represent any character sequence, which in regexp is .* so you have to replace * with .* in the user-input expression before passing it to regexp. Other special characters that you want to treat literally have to be escaped by prepending \, so that . becomes \. and $ becomes \$, and so on. The function get_matches defined below does this replacement explicitly for a few special characters, anchors the expression with ^ and $ so that it must match the whole string, and returns the case-insensitive matches. You can add more special characters to it as needed.
str = [
"Stock_field1_img"
"Sys_tim_valt98.qaf.rat.app.gui"
"Enable1.HSB_setblcondition.Enable_logic.ui"
"P_k12.delay.init_func_delay_update.Sys"
"#fat_11ks.ergaa.ths.dell"
"$thispt.dynmem11.ide.gra"
];
user_input = "st*"; % return any string starting with st
matched_str = get_matches(str,user_input)
matched_str = "Stock_field1_img"
user_input = "*.sys"; % ending with .sys
matched_str = get_matches(str,user_input)
matched_str = "P_k12.delay.init_func_delay_update.Sys"
user_input = "*del*"; % containing del
matched_str = get_matches(str,user_input)
matched_str = 2×1 string array
    "P_k12.delay.init_func_delay_update.Sys"
    "#fat_11ks.ergaa.ths.dell"
user_input = "$*"; % starting with $
matched_str = get_matches(str,user_input)
matched_str = "$thispt.dynmem11.ide.gra"
user_input = "*.*d*.*"; % containing d somewhere between two .s
matched_str = get_matches(str,user_input)
matched_str = 3×1 string array
    "Enable1.HSB_setblcondition.Enable_logic.ui"
    "P_k12.delay.init_func_delay_update.Sys"
    "$thispt.dynmem11.ide.gra"
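function a = get_matches(str,user_input)
% Translate the wildcard expression: * becomes .* and the literals . $ ^ are escaped.
regex = replace(user_input,["*",".","$","^"],[".*","\.","\$","\^"]);
% Require a whole-string, case-insensitive match and drop the non-matching entries.
a = rmmissing(regexpi(str,"^"+regex+"$",'match','once'));
end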
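
To make the translation step concrete, the sketch below prints the regular expression that each wildcard pattern from the question turns into, using the same kind of replace call as get_matches (restricted to * and . for brevity). The variable names and the sample signal name are assumptions made for illustration; the sample is not part of the original array.

```matlab
% Wildcard patterns taken from the question.
user_inputs = ["st*"; "*st"; "*st*"; "*st*app.*sys*"];

% Same idea as get_matches: * becomes .* and a literal . becomes \.
% The ^ and $ anchors force the pattern to cover the whole string.
regexes = "^" + replace(user_inputs, ["*","."], [".*","\."]) + "$";
disp(regexes)

% Hypothetical signal name (not from the original array) that fits the last pattern.
sample = "Stock_field.app.cfg.Sys_tim";
is_match = ~isempty(regexpi(sample, regexes(4), 'once'))
```

The last pattern becomes ^.*st.*app\..*sys.*$, which is why the order of "st", "app." and "sys" matters while the text between them does not.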
3 Comments
Stephen23 on 25 Sep 2023

Note that regexptranslate can be used to escape all special characters:

https://www.mathworks.com/help/matlab/ref/regexptranslate.html
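
One possible way to combine this suggestion with the wildcard handling is sketched below: escape the whole user input with regexptranslate('escape',...), then turn each escaped wildcard \* back into .* so that only * keeps a special meaning. This is a sketch under assumptions, not the method from the accepted answer; the pattern "$*.ide.*" and the shortened str array are chosen only for illustration.

```matlab
% A few entries from the question's array.
str = [
    "Stock_field1_img"
    "P_k12.delay.init_func_delay_update.Sys"
    "$thispt.dynmem11.ide.gra"
    ];

user_input = "$*.ide.*";   % hypothetical input: starts with $, contains .ide.

% Escape every regex metacharacter, including the user's * wildcards.
% char/string conversions keep the types explicit.
escaped = string(regexptranslate('escape', char(user_input)));

% Turn each escaped wildcard \* back into "any character sequence".
regex = "^" + replace(escaped, "\*", ".*") + "$";

% Whole-string, case-insensitive match; non-matching entries are dropped.
matched_str = rmmissing(regexpi(str, regex, 'match', 'once'))
```

With this approach there is no hand-maintained list of special characters, so inputs containing characters such as + or ( should also be treated literally.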

ML_Analyst on 25 Sep 2023
Thanks @Stephen23

Release

R2021b
