Simplify a string with a MATLAB script

Hello!
I have a string whose format looks like this (the part before the '_' and the numbers is not always 'A'):
Eq = 'A_1+((A_2+A_3)&(A_4+A_5))+A_6';
How can I simplify the string like this (with a script):
Eq = 'A_(1+((2+3)&(4+5))+6)'
For me the simplest way would be to delete every occurrence of 'A_' except the first one and to add a '(' after the first 'A_', but I don't know how to do it in a script.
Thank you in advance for your help!
Edit: After reviewing Stephen's answer, I realized I missed a detail. Sometimes the string can look like this:
'A_number_1+((A_number_2+A_number_3)&(A_number_+A_number_5))+A_number_6';
==> 'A_number_(1+((2+3)&(4+5))+6)'
The name can consist of multiple strings, for example 'String_String_String_1'. Is there any way to delete the strings and the '_' preceding a number?

Accepted Answer

Stephen23 on 27 Feb 2020
Edited: Stephen23 on 27 Feb 2020
>> Eq = 'A_1+((A_2+A_3)&(A_4+A_5))+A_6';
>> Fq = regexprep(Eq, '^([A-Z]+_)(.*)', '$1\(${strrep($2,$1,'''')}\)')
Fq =
A_(1+((2+3)&(4+5))+6)
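For readers new to regular expressions, the idea described in the question (remove every 'A_', then wrap the rest in parentheses) can also be sketched with plain strrep. This is only an illustration and not part of the original answer: it assumes the prefix is known in advance, and the variable names prefix and body are made up for this sketch.

```matlab
Eq = 'A_1+((A_2+A_3)&(A_4+A_5))+A_6';
prefix = 'A_';                    % assumption: the prefix is known beforehand
body = strrep(Eq, prefix, '');    % remove every occurrence of the prefix
Fq = [prefix '(' body ')'];       % put the prefix back once, then parenthesize
```

For this input, Fq should match the output of the regexprep call above. Unlike the regular expression, this version does not detect the prefix by itself.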

10 Comments

Lucas S on 27 Feb 2020
Thanks! I really need to learn how to use regexp; I clearly didn't understand how you did this.
Lucas S on 27 Feb 2020
Edited: Lucas S on 27 Feb 2020
And what if the string looks more like this, for example:
Eq = 'A_number_1+((A_number_2+A_number_3)&(A_number_+A_number_5))+A_number_6';
==> 'A_number_(1+((2+3)&(4+5))+6)'
Is there a way to delete everything before a '_'? With this script, only the first string (here 'A_') is simplified.
Stephen23 on 27 Feb 2020
Edited: Stephen23 on 28 Feb 2020
"Is there a way to delete all string before a '_' ?"
Of course, you just need to specify exactly what you require.
"Because with this script it only simplify 1st string which is 'A_' here"
Because that is exactly what you specified in your question.
So, making a few guesses... perhaps something like this does what you want (I added the missing '4'):
>> Eq = 'A_number_1+((A_number_2+A_number_3)&(A_number_4+A_number_5))+A_number_6';
>> Fq = regexprep(Eq, '^(\w+?)(\d+\W.*)', '$1\(${strrep($2,$1,'''')}\)')
Fq =
A_number_(1+((2+3)&(4+5))+6)
An example where I replaced "number" with some random digits:
>> Eq = 'A_123_1+((A_123_2+A_123_3)&(A_123_4+A_123_5))+A_123_6';
>> Fq = regexprep(Eq, '^(\w+?)(\d+\W.*)', '$1\(${strrep($2,$1,'''')}\)')
Fq =
A_123_(1+((2+3)&(4+5))+6)
And with the original example from your question:
>> Eq = 'A_1+((A_2+A_3)&(A_4+A_5))+A_6';
>> Fq = regexprep(Eq, '^(\w+?)(\d+\W.*)', '$1\(${strrep($2,$1,'''')}\)')
Fq =
A_(1+((2+3)&(4+5))+6)
How the regular expression is defined:
'^(\w+?)(\d+\W.*)'
%^                  match start of string
% (    )            token 1
%  \w+?             lazy match of letters, digits, and underscore
%       (       )   token 2
%        \d+        greedy match of digits
%           \W      match one non-letter, non-digit, non-underscore (e.g. plus, parenthesis, etc.)
%             .*    greedy match of any characters
Token 1 matches the 'String_String_String_' part of the name (or whatever you call it), but lazily (i.e. as few characters as possible), while token 2 greedily matches the digits, e.g. '1' (i.e. it tries to collect as many digits as possible). The \W forces the digits at the start of token 2 to be the last digit/s in the name (or whatever you call it), even if digits occur elsewhere in token 1. The replacement string works like this:
'$1\(${strrep($2,$1,'''')}\)'
%$1                           token 1, e.g. 'String_String_'
%  \(                     \)  literal parentheses
%    ${                  }    dynamic expression to call function
%      strrep($2,$1,'''')     all instances of token 1 in token 2 replaced with ''
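The same two steps can also be written without a dynamic expression, which may be easier to follow when first learning regular expressions. This is a minimal sketch rather than the method used above: the names tok, prefix and rest are hypothetical and chosen only for illustration.

```matlab
Eq = 'A_number_1+((A_number_2+A_number_3)&(A_number_4+A_number_5))+A_number_6';
% Capture the two tokens described above as a 1x2 cell array.
tok = regexp(Eq, '^(\w+?)(\d+\W.*)', 'tokens', 'once');
if isempty(tok)
    Fq = Eq;                          % no match: leave the string unchanged
else
    prefix = tok{1};                  % token 1, here 'A_number_'
    rest   = tok{2};                  % token 2, starting at the first index
    % Remove the prefix everywhere in token 2, then add it back once.
    Fq = [prefix '(' strrep(rest, prefix, '') ')'];
end
```

For this input, Fq should equal the regexprep result shown earlier. The isempty branch covers strings with no non-word character after the first number, such as a lone 'A_1', which the pattern does not match.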
Lucas S on 28 Feb 2020
Edited: Lucas S on 28 Feb 2020
Thank you very much! I think I understand how you did it, so I can adapt it if my string has more 'string_' iterations.
Lucas S on 28 Feb 2020
OK, I think I can't solve my problem with regex, as my string never has the same format.
Stephen23 on 28 Feb 2020
"I think i understand how you did it so i can adapt if my string has more 'string_' iterations "
You certainly do NOT need to "adapt" the regular expression I gave in my previous comment for more "'string_' iterations", because that regular expression does not know or care about "iterations" within the name. It only matters that the name contains one or more letters, digits, and/or underscores. The order of those characters is totally irrelevant, as is the fact that a human might see "iterations" within the name.
"Ok i think with regex i can't solve my problem as my string has never the same format."
What "format" are you referring to? The "format" of the entire string or just the leading name?
Rather than jumping to incorrect conclusions based on incorrect understandings of regular expressions, I recommend communicating the exact "format" requirements that you have, together with some examples.
Lucas S on 28 Feb 2020
Edited: Lucas S on 28 Feb 2020
In the future I would like to run this script in a loop over many different strings.
The strings all have the same format:
'string_string_string_number'
But the lengths can differ:
it can be 'string_string_number'
or 'string_string_string_number'
or 'string_string_string_string_number'
etc...
I found a solution that can work: store one occurrence of the full string (the part before the number), delete all occurrences of it, and add it back once at the start.
Or use regexp, but as far as I understand, regexp only works for strings with the same format and the same length (I'm probably wrong).
I made this and I think it works (ugly, but working):
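A = 'A_number_1+((A_number_2+A_number_3)&(A_number_4+A_number_5))+A_number_6';
CR_string = '';
for i = 1:length(A)
    if A(i) ~= '('                          % skip opening parentheses
        if isstrprop(A(i), 'digit')         % stop at the first digit (was: only at a '1')
            break;
        else
            CR_string = strcat(CR_string, A(i));   % collect the leading name
        end
    end
end
A = erase(A, CR_string);                    % delete every occurrence of the name
A = strcat(CR_string, '(', A, ')');         % add it back once and parenthesize
disp(A);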
Stephen23 on 28 Feb 2020
Edited: Stephen23 on 28 Feb 2020
"Or use regexp but for me regexp is only for same format's same length's string (i'm probably wrong)"
I can't see anything in the regular expression that restricts it to "same length's string" as you write; it adapts exactly to the length of the leading name, with any number of 'string_' repetitions.
When I run it on all of your example strings, I get the expected outputs, e.g.:
>> Eq = 'A_number_1+((A_number_2+A_number_3)&(A_number_4+A_number_5))+A_number_6';
>> Fq = regexprep(Eq, '^(\w+?)(\d+\W.*)', '$1\(${strrep($2,$1,'''')}\)')
Fq =
A_number_(1+((2+3)&(4+5))+6)
and
>> Eq = 'string_string_number_1+((string_string_number_2+string_string_number_3)&(string_string_number_4+string_string_number_5))+string_string_number_6';
>> Fq = regexprep(Eq, '^(\w+?)(\d+\W.*)', '$1\(${strrep($2,$1,'''')}\)')
Fq =
string_string_number_(1+((2+3)&(4+5))+6)
etc. etc.
Please show me a single example of my code that does not give the expected output, so that I can check it.
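As an illustration of the loop use case mentioned earlier in the thread, here is a minimal sketch that applies the same expression to several strings whose names have different lengths. The inputs in C and the variable names pat, rep and out are made up for this example.

```matlab
% Hypothetical inputs with different numbers of 'string_' parts.
C = {'A_1+A_2', ...
     'A_number_1+(A_number_2&A_number_3)', ...
     'str_str_str_1+str_str_str_2'};
pat = '^(\w+?)(\d+\W.*)';                 % same pattern as above
rep = '$1\(${strrep($2,$1,'''')}\)';      % same replacement as above
out = cell(size(C));
for k = 1:numel(C)
    out{k} = regexprep(C{k}, pat, rep);   % one call per string, nothing to adapt
end
```

Neither the pattern nor the replacement changes between iterations; only the input string does.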
Lucas S on 28 Feb 2020
Hmm, OK, I'm dumb: I tried your first post, thinking it was the new one... My bad, thank you!

Asked: 27 Feb 2020
Commented: 28 Feb 2020
