Removing specific characters from string in nested cells
I have a series of strings contained within a nested cell array (because regexp loves to nest cells), and I would like to remove any characters that are not numeric or white space, namely the asterisks, so that I can convert the strings to doubles.
I'm looking for the least painful way of removing these special characters from all of the strings. I do not have a sample file to attach, sorry, but I have described the shape of a sample array below.
X == 1x1 cell
X{1} == 1x1 cell (because regexp can't help itself apparently)
X{1}{1} = {'1234., ';'12.,* ';'1234., ';'123.,* ';' 321.,* '};
Stephen23
on 13 Jun 2018
@Bob Nbob: is this related to your earlier question?
If so, it would probably be easier to fix the regular expression. Please upload a sample file that you want to get the data from.
Why not just
x = {'1234., ';'12.,* ';'1234., ';'123.,* ';' 321.,* '};
x = regexprep(x,'[^\d]','');
?
As mentioned by Stephen, it's probably easier to fix the regex used in your earlier question. I left a comment there.
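For the nested shape described in the question, the regexprep call above can be combined with a small unwrapping loop. This is only a sketch: the variable names `inner`, `cleaned`, and `values` are made up for illustration, and it assumes every outer layer is a 1x1 cell, as in the sample. Note that '[^\d]' also strips the period, which is harmless for this sample but would merge a decimal such as '1.5' into '15'; '[^\d.]' would keep it.

```matlab
% Sample with the same nesting as in the question: X{1}{1} is a 5x1 cell of strings.
X = {{{'1234., ';'12.,* ';'1234., ';'123.,* ';' 321.,* '}}};

% Peel off 1x1 cell layers until a multi-element cell (or a non-cell) remains.
inner = X;
while iscell(inner) && isscalar(inner)
    inner = inner{1};
end

% Remove every non-digit character, then convert each string to a double.
cleaned = regexprep(inner, '[^\d]', '');
values = str2double(cleaned);   % 5x1 double: 1234, 12, 1234, 123, 321
```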
Bob Thompson
on 13 Jun 2018
Not the prettiest but does the job, try this:
[tokens,matches]=regexp(yourtext,'(COLUMN[13]=\s*)(\d*\.?\d*)(?:\,\s*)(\d*\.\d*)(?:\,\s*)(\d*\.\d*)(?:\,\s*)(\d*\.\d*)(?:\*?\,\s*)(\d*\.\d*)(?:\*?\,\s*)(\d*\.\d*)(?:\*?\,\s)','tokens','match');
tokens{1}:
1×7 cell array
{'COLUMN1= '} {'1.12'} {'2.23'} {'3.34'} {'4.45'} {'5.56'} {'6.67'}
tokens{2}:
1×7 cell array
{'COLUMN3= '} {'1.23'} {'0.34'} {'3.45'} {'5.78'} {'6.54'} {'8.23'}
Bob Thompson
on 13 Jun 2018
OCDER
on 13 Jun 2018
Would something like this work?
Str = 'COLUMN3= 1.23, 0.34, 3.45, 5.78*, 6.54*, 8.23, 2, -3., 24.*';
EqIdx = find(Str == '=', 1);
if ~isempty(EqIdx)
    Num = str2double(regexp(Str(EqIdx+1:end), '\-?\d+\.?\d*', 'match'));
end
Bob Thompson
on 13 Jun 2018
OCDER
on 13 Jun 2018
I might need more information about the start-to-end issue you're having. How are you reading in the text file: with fileread, fgetl, or textscan? If you use fgetl or textscan, you can get each row of text separately and then pick the one you want. If you're using fileread, the whole file comes back as a single character vector, which makes that much harder.
FID = fopen('textfile.txt');
TXT = textscan(FID, '%s', 'Delimiter', '\n');
TXT = TXT{1};
fclose(FID);
Num = cell(size(TXT));
for f = 1:length(TXT)
    Str = TXT{f};
    if contains(Str, 'CONTAINS=') % Specify the condition for the line you want here
        EqIdx = find(Str == '=', 1); % Example: you want the values after "="
        Num{f} = str2double(regexp(Str(EqIdx+1:end), '\-?\d+\.?\d*', 'match'));
    end
end
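If the text has already been read with fileread, one possible workaround is to split the single character vector into rows first and then apply the same per-row logic. The following is a rough sketch rather than a tested recipe for the actual file: the text in `txt` is invented as a stand-in for the output of fileread, and the names `rowText`, `thisRow`, and the 'COLUMN' prefix test are assumptions.

```matlab
% Stand-in for: txt = fileread('textfile.txt');  (file content invented for illustration)
txt = sprintf('HEADER LINE\nCOLUMN1= 1.12, 2.23*, 3.34\nFOOTER LINE\n');

% Split the character vector into rows at each line break.
rowText = regexp(txt, '\r?\n', 'split');

Num = cell(size(rowText));
for k = 1:numel(rowText)
    thisRow = rowText{k};
    if startsWith(thisRow, 'COLUMN')        % assumed condition for the wanted rows
        eqIdx = find(thisRow == '=', 1);    % take the values after the first "="
        Num{k} = str2double(regexp(thisRow(eqIdx+1:end), '-?\d+\.?\d*', 'match'));
    end
end
```

Rows that do not satisfy the condition leave an empty entry in Num, the same as in the textscan loop above.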
Bob Thompson
on 14 Jun 2018
Edited: Bob Thompson
on 14 Jun 2018
"I'm not really sure what the ':' from Paolo's comment is supposed to do, I don't see it anywhere in the regexp documentation..."
Open the documentation, then use ctrl+f to search the webpage for ?:
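For context, (?:expr) groups part of an expression without capturing it as a token. A minimal sketch of the difference, using an invented test string and made-up variable names:

```matlab
% Invented test string; each number is followed by a comma and a space.
str = 'COLUMN1= 1.12, 2.23, ';

% Plain parentheses capture the separator as a second token.
tokCapturing = regexp(str, '(\d+\.\d+)(,\s*)', 'tokens');

% (?:...) still requires the separator to match, but does not return it.
tokNonCapturing = regexp(str, '(\d+\.\d+)(?:,\s*)', 'tokens');

nCapturing = numel(tokCapturing{1});        % 2 tokens per match: number and separator
nNonCapturing = numel(tokNonCapturing{1});  % 1 token per match: number only
firstValue = str2double(tokNonCapturing{1}{1});
```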
Bob Thompson
on 15 Jun 2018
Stephen23
on 15 Jun 2018
@Bob Nbob: you are right, it does not appear in the Mfile help. I notice that many other useful regular expression features also do not appear in the Mfile help: notably missing are dynamic expressions, lookaround operators, and named capture.
Both the inbuilt help and the page I linked to give a very useful introduction, and explain all features of regular expressions in MATLAB:
doc regexp
doc('Regular Expressions')
Accepted Answer
More Answers (1)
George Abrahams
on 30 Dec 2022
The others are right to fix the root problem causing the tricky nested cell array. Having said that, for future reference, my deepreplace function on File Exchange / GitHub would have done exactly what you requested.
x = {{{'1234., ';'12.,* ';'1234., ';'123.,* ';' 321.,* '}}};
% Remove any character except for digits (0-9) and period (.)
match = regexpPattern('[^\d.]');
x = deepreplace(x,match,'');
% x = 1×1 cell array
% {1×1 cell}
% x{1} = 1×1 cell array
% {5×1 cell}
% x{1}{1} = 5×1 cell array
% {'1234.'}
% {'12.' }
% {'1234.'}
% {'123.' }
% {'321.' }