
How can I use regexp to return a list of variable names?

Joshua Muse on 22 Feb 2018
Edited: Stephen23 on 23 Feb 2018
I want to extract variable names from a string. For my purposes, a variable starts with a letter or an underscore and may be followed by anything except an open parenthesis ("(").
So
'sin(var1) + var2'
should become something like
["var1" "var2"]
I tried this:
testStr = 'sin(var1) + var2';
vars = regexp(testStr,'([a-zA-Z_]\w*)(?:[^(\w]|$)','tokens')
and got this:
ans =
0x0 empty cell array
What am I doing wrong?
  3 Comments
Joshua Muse on 23 Feb 2018
You're right. I was not very clear about my definition of a variable. A variable:
  • begins with either an English letter or an underscore
  • contains any number of English letters, underscores, and digits
  • ends before something that is not an English letter, underscore, or digit
In the string:
'var1 * _var2 * 2var3 + func(var5))'
the variables are:
var1
_var2
var3
var5
Notice that the 2 in front of var3 is not included, nor is "func".
I've tested the expression on regex101 and it matches all of the correct substrings, but when I call regexp with the 'tokens' option, I don't get an array of the token text like I was expecting.
Stephen23 on 23 Feb 2018
Edited: Stephen23 on 23 Feb 2018
func meets these three conditions:
  • "begins with either an English letter or an underscore" yes!
  • "contains any number of English letters, underscores, and digits" yes!
  • "ends before something that is not an English letter, underscore, or digit" yes!
So why is func not on your list of variables when it meets all of your conditions?
Note that the regular expression you defined on regex101 actually uses "ends before something that is not an English letter, underscore, digit, or open bracket".
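To make that difference concrete, here is a minimal sketch using the example string from the comment above. The names `loose` and `strict` are illustrative only, and the second pattern is one possible way to write the extra "not followed by an open parenthesis" rule, not necessarily the expression that was tested on regex101.

```matlab
str = 'var1 * _var2 * 2var3 + func(var5))';

% Only the three stated conditions: a letter or underscore, then word characters.
% This also returns the function name.
loose = regexp(str, '[A-Za-z_]\w*', 'match');

% Extra condition: the name must not be followed by "(".
% The \w inside the lookahead stops the engine from backtracking to "fun".
strict = regexp(str, '[A-Za-z_]\w*(?![\(\w])', 'match');

disp(loose)
disp(strict)
```

With the first pattern `func` is returned alongside the variables; with the second it is not.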


Answers (2)

Ji Huang on 23 Feb 2018
Edited: Ji Huang on 23 Feb 2018
I would do it in two steps. First, remove the function names, i.e. any word characters directly before an open parenthesis:
testStr = 'sin(var1) + var2';
var_step_1 = regexprep(testStr,'[\w_]{0,}\(', '\(')
This gives "(var1) + var2". Then match the variable names:
var_step_2 = regexp(var_step_1,'[\w_]{0,}', 'match')
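A hedged variation on the two steps above, applied to the longer example string from the comments. Two things here are assumptions rather than part of the answer: the replacement is written as a plain '(', and the second step requires a leading letter or underscore so that the 2 in "2var3" and any bare numbers are not returned. The names `noFuncs` and `names` are illustrative.

```matlab
str = 'var1 * _var2 * 2var3 + func(var5))';

% Step 1: drop any run of word characters that sits directly before "(".
noFuncs = regexprep(str, '\w*\(', '(');

% Step 2: collect names that start with a letter or underscore.
names = regexp(noFuncs, '[A-Za-z_]\w*', 'match');

disp(noFuncs)
disp(names)
```

Matching with `\w*` alone in the second step would return "2var3" as a whole, which is why the leading character is restricted in this sketch.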

Stephen23 on 23 Feb 2018
Edited: Stephen23 on 23 Feb 2018
I used regexpi for simplicity:
>> str = 'var1 * _var2 * 2var3 + func(var5))';
>> C = regexpi(str,'([A-Z_]\w*)(?![\(\w])','match');
>> C{:}
ans = var1
ans = _var2
ans = var3
ans = var5
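Since the question asked for 'tokens' and a string array, the following sketch shows how the two output options differ in shape for the same pattern. It is only an illustration and does not attempt to explain the empty result reported in the question. The variable names are hypothetical, and `string` assumes a release that supports string arrays.

```matlab
str = 'sin(var1) + var2';
pat = '([A-Za-z_]\w*)(?![\(\w])';

% 'tokens' returns one cell per match, each holding a cell of that match's tokens.
tok = regexp(str, pat, 'tokens');

% Flatten the nested cells into a single cell array of names.
names = [tok{:}];

% 'match' returns the matched text directly, with no nesting.
matched = regexp(str, pat, 'match');

% Convert to a string array, as in the output the question asks for.
asString = string(names);
```

Because the whole pattern outside the lookahead is the single token, 'match' and the flattened 'tokens' output contain the same names here.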
If you want to develop regular expressions then you might be interested in downloading my simple Interactive Regular Expression tool:
