How can I regexp this sample text?
I'm looking to grab all the data after "CHANNEL" into a cell array such that C{1,1} = Shell Angle and so on,
and to grab all the data after "UNITS" into a separate cell array such that D{1,1} = deg and so on.
I'm having trouble with the non-alphanumeric characters such as @, (, ), etc.
Any help greatly appreciated. Thanks!
Answers (2)
Stephen23
on 28 Mar 2019
Edited: Stephen23
on 28 Mar 2019
Here is a regexp-based solution that does NOT use hard-coded variable names (it detects the variable names automatically) and does NOT magically create variables in the MATLAB workspace. Dynamically naming variables is a bad way to write code.
The code is quite straightforward: the first regexp identifies each variable name and its pair of square brackets, and the second regexp identifies the list values themselves:
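str = fileread('sample2.txt');
% First pass: capture each variable name and the text between its square brackets.
tkn = regexp(str,'(\w+)\s+=\s+\[([^\]]+)\]','tokens');
tkn = vertcat(tkn{:}); % N-by-2 cell: {name, bracket contents}
hdr = tkn(:,1);
% Second pass: capture every single-quoted value inside each bracketed list.
tkn = regexp(tkn(:,2),'''([^'']+)''','tokens');
tkn = cellfun(@(c)[c{:}],tkn,'UniformOutput',false); % flatten the nested token cells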
And checking:
>> hdr{2}
ans =
UNIT
>> tkn{2}
ans =
Columns 1 through 8
'deg' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
Columns 9 through 16
'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
Columns 17 through 20
'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
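For anyone without sample2.txt, here is a minimal sketch that runs the same two regexp passes on an inline string. The sample text is an assumption: the file layout (bracketed lists of single-quoted values, continued with "&") is guessed from the answers in this thread, and the three entries per list are shortened stand-ins for the real twenty. C and D are the cell array names asked for in the question.

```matlab
% Hypothetical stand-in for sample2.txt (the real file is not shown here).
str = sprintf('%s\n', ...
    'CHANNEL = [''Shell Angle'', &', ...
    '  ''Surface Heat Flux @ GridLine=1'', &', ...
    '  ''Surface Heat Flux @ GridLine=2'']', ...
    'UNIT = [''deg'', &', ...
    '  ''lbf/(ft.s)'', &', ...
    '  ''lbf/(ft.s)'']');

% Pass 1: variable name plus everything between its square brackets.
tkn = regexp(str, '(\w+)\s+=\s+\[([^\]]+)\]', 'tokens');
tkn = vertcat(tkn{:});
hdr = tkn(:,1);

% Pass 2: each single-quoted value. Characters such as @, ( and ) need no
% special handling because [^'']+ accepts anything except a quote.
tkn = regexp(tkn(:,2), '''([^'']+)''', 'tokens');
tkn = cellfun(@(c) [c{:}], tkn, 'UniformOutput', false);

% Pick the lists out by header name rather than by position.
C = tkn{strcmp(hdr, 'CHANNEL')};
D = tkn{strcmp(hdr, 'UNIT')};
fprintf('%s [%s]\n', C{1,1}, D{1,1});
```

Looking the lists up with strcmp keeps the code independent of the order in which the variables appear in the file.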
You can access the data directly from the cell arrays, or use them to define a convenient structure:
>> S = cell2struct(tkn,hdr,1);
>> S.UNIT
ans =
Columns 1 through 8
'deg' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
Columns 9 through 16
'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
Columns 17 through 20
'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)' 'lbf/(ft.s)'
>> S.CHANNEL
ans =
Columns 1 through 8
'Shell Angle' [1x30 char] [1x30 char] [1x30 char] [1x30 char] [1x30 char] [1x30 char] [1x30 char]
Columns 9 through 16
[1x30 char] [1x30 char] [1x31 char] [1x31 char] [1x31 char] [1x31 char] [1x31 char] [1x31 char]
Columns 17 through 20
[1x31 char] [1x31 char] [1x31 char] [1x31 char]
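As a rough illustration of why the structure is convenient, the sketch below loops over whatever fields were found using dynamic field names, so nothing is hard-coded and no workspace variables are created. The hdr and tkn values are shortened, made-up stand-ins for the parsed results above.

```matlab
% Shortened stand-ins for the parsed headers and values (illustrative only).
hdr = {'CHANNEL'; 'UNIT'};
tkn = {{'Shell Angle', 'Surface Heat Flux @ GridLine=1'}; ...
       {'deg', 'lbf/(ft.s)'}};

% One field per header; each field holds that header's cell array of values.
S = cell2struct(tkn, hdr, 1);

% Visit every field without knowing its name in advance.
names = fieldnames(S);
for k = 1:numel(names)
    vals = S.(names{k});   % dynamic field name access
    fprintf('%s: %d values, first is ''%s''\n', names{k}, numel(vals), vals{1});
end
```

Note that cell2struct requires each header to be a valid MATLAB field name; the \w+ pattern gets most of the way there, but a header starting with a digit would still need fixing up first.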
2 Comments
Guillaume
on 28 Mar 2019
Edited: Guillaume
on 28 Mar 2019
>> map = containers.Map(hdr, tkn);
>> map('UNIT')
ans =
1×20 cell array
Columns 1 through 3
{'deg'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 4 through 6
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 7 through 9
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 10 through 12
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 13 through 15
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 16 through 18
{'lbf/(ft.s)'} {'lbf/(ft.s)'} {'lbf/(ft.s)'}
Columns 19 through 20
{'lbf/(ft.s)'} {'lbf/(ft.s)'}
Whatever you do, do not use values coming out of a text file to name actual variables. Creating variables dynamically is a sure way to introduce bugs: you may well overwrite an existing variable if, for some reason, the input file is slightly different.
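A small sketch of how the map copes with an input file that is slightly different: a missing name is detected with isKey instead of causing an error or clobbering a variable. The hdr and tkn values are shortened stand-ins, and 'PRESSURE' is a made-up name that is deliberately absent.

```matlab
% Shortened stand-ins for the parsed headers and values (illustrative only).
hdr = {'CHANNEL'; 'UNIT'};
tkn = {{'Shell Angle', 'Surface Heat Flux @ GridLine=1'}; ...
       {'deg', 'lbf/(ft.s)'}};

map = containers.Map(hdr, tkn);

% 'PRESSURE' is hypothetical: it simulates a name the file does not contain.
wanted = {'UNIT', 'PRESSURE'};
found = false(size(wanted));
for k = 1:numel(wanted)
    found(k) = isKey(map, wanted{k});
    if found(k)
        vals = map(wanted{k});
        fprintf('%s: %d values\n', wanted{k}, numel(vals));
    else
        fprintf('%s: not present in the file\n', wanted{k});
    end
end
```

Whatever names the file contains, they only ever become keys in the map, so nothing already in the workspace can be overwritten.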
Adam Danz
on 21 Mar 2019
Edited: Adam Danz
on 28 Mar 2019
In this solution, the file is first read into MATLAB as a char array. I then isolate the "CHANNEL" variable and rewrite it in MATLAB syntax so that it can be evaluated as a cell array. The same is then done for the "UNIT" variable.
file = 'C:\Users\name\Documents\MATLAB\sample2.txt';
c = fileread(file);
% segregate "Channel" array
chanTxt = regexp(c,'CHANNEL.+?\]', 'match');
% Replace ampersands with ellipses (MATLAB line continuation)
chanTxt = strrep(chanTxt, '&', '...');
% Replace square bracket with curly brackets
chanTxt = strrep(chanTxt, '[', '{');
chanTxt = strrep(chanTxt, ']', '}');
% Execute string to convert it to cell array of strings
eval(chanTxt{:}); % Now you have a variable named "CHANNEL"
% CHANNEL' % (first 6 rows shown)
% ans =
% 20×1 cell array
% {'Shell Angle' }
% {'Surface Heat Flux @ GridLine=1' }
% {'Surface Heat Flux @ GridLine=2' }
% {'Surface Heat Flux @ GridLine=3' }
% {'Surface Heat Flux @ GridLine=4' }
% {'Surface Heat Flux @ GridLine=5' }
% Now go through the same process for "UNIT"
unitTxt = regexp(c,'UNIT.+?\]', 'match');
unitTxt = strrep(unitTxt, '&', '...');
unitTxt = strrep(unitTxt, '[', '{');
unitTxt = strrep(unitTxt, ']', '}');
eval(unitTxt{:}); % Now you have a variable named "UNIT"
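For comparison, here is a sketch of a variant that keeps the same 'CHANNEL.+?\]' and 'UNIT.+?\]' matching step but skips the rewrite-and-eval step, pulling the quoted strings out with a second regexp instead. This is a swapped-in technique, not the code above, and the inline sample text is an assumed stand-in for sample2.txt.

```matlab
% Hypothetical stand-in for the file contents returned by fileread.
c = sprintf('%s\n', ...
    'CHANNEL = [''Shell Angle'', &', ...
    '  ''Surface Heat Flux @ GridLine=1'']', ...
    'UNIT = [''deg'', &', ...
    '  ''lbf/(ft.s)'']');

% Same isolation step as above; 'once' returns the match as a char vector.
chanTxt = regexp(c, 'CHANNEL.+?\]', 'match', 'once');
unitTxt = regexp(c, 'UNIT.+?\]', 'match', 'once');

% Extract the single-quoted values directly instead of evaluating the text.
CHANNEL = regexp(chanTxt, '''([^'']+)''', 'tokens');
CHANNEL = [CHANNEL{:}]';   % column cell array of char vectors
UNIT = regexp(unitTxt, '''([^'']+)''', 'tokens');
UNIT = [UNIT{:}]';

disp(CHANNEL);
disp(UNIT);
```

Because the variable names are written in the code rather than taken from the file, an unexpected line in the input cannot be executed.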
A great workbench for developing regular expressions: https://regex101.com/