How to extract data from table variable names?

MIM Maestro puts volume data inside the table variable names:
Rectum (4)(Volume: 57.77) Bladder (5)(Volume: 139.40)
How do we extract this volume data?

Answers (2)

Daniel Bridges on 14 Mar 2018
This method works, but I suspect there is a more elegant solution.
% Get the list of variable names
opts = detectImportOptions(filepath, 'NumHeaderLines', 1);
% Find the column whose name starts with 'Rectum_'
RectumSearching = regexp(opts.VariableNames, '^Rectum_', 'once');
for loop = 1:numel(opts.VariableNames)
    if isequal(RectumSearching{loop}, 1)
        index = loop;
    end
end
% Extract the volume from that name: drop the trailing underscore,
% then turn the remaining underscore into a decimal point
volume = extractAfter(opts.VariableNames{index}, 'Volume_');
volume = str2double(strrep(volume(1:end-1), '_', '.'))
Result:
volume =
57.7700
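The loop in the answer above can be avoided. The following is only a minimal sketch: the cell array `names` is a hypothetical stand-in for `opts.VariableNames`, and the exact name format (underscores in place of the spaces, parentheses, colon and decimal point) is an assumption inferred from the search strings used above.

```matlab
% Hypothetical variable names, standing in for opts.VariableNames
names = {'Patient_ID', 'Rectum_4__Volume_57_77_', 'Bladder_5__Volume_139_40_'};

% Locate the first name that starts with 'Rectum_' (no loop needed)
index = find(startsWith(names, 'Rectum_'), 1);

% Keep the text after 'Volume_', drop the trailing underscore,
% and turn the remaining underscore into a decimal point
volume = extractAfter(names{index}, 'Volume_');
volume = str2double(strrep(volume(1:end-1), '_', '.'));
```

`startsWith` returns a logical array, so `find(..., 1)` gives the index of the first matching column directly.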
Stephen23 on 14 Mar 2018
Edited: Stephen23 on 14 Mar 2018
>> C = {'Rectum (4)(Volume: 57.77)','Bladder (5)(Volume: 139.40)'};
>> str2double(regexp(C,'\d+(\.\d+)?(?=\)$)','once','match'))
ans =
57.770 139.400
Or to require the preceding 'Volume' substring:
str2double(regexp(C,'(?<=Volume: )\d+(\.\d+)?(?=\)$)','once','match'))
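For readers unfamiliar with regular expressions, the pattern above can be assembled from its parts. This is just an illustrative breakdown; the variable names `digits`, `fraction`, `atEnd` and `afterVol` are made up for the sketch.

```matlab
C = {'Rectum (4)(Volume: 57.77)', 'Bladder (5)(Volume: 139.40)'};

% Build the pattern piece by piece
afterVol = '(?<=Volume: )'; % lookbehind: 'Volume: ' must precede (not part of the match)
digits   = '\d+';           % one or more digits, e.g. 57
fraction = '(\.\d+)?';      % optional decimal point followed by digits, e.g. .77
atEnd    = '(?=\)$)';       % lookahead: a ')' must follow at the very end (not part of the match)

pattern = [afterVol, digits, fraction, atEnd];

% 'match' returns the matched text; 'once' returns only the first match per cell
txt = regexp(C, pattern, 'once', 'match');
vol = str2double(txt);
```

Because the lookbehind and lookahead only assert context, `txt` holds just the number text, which `str2double` then converts.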

6 comments

Daniel Bridges on 14 Mar 2018
Edited: Daniel Bridges on 14 Mar 2018
Looking at the regexp documentation I have not yet been able to interpret your expression. Using that code results in NaN, unfortunately.
Namely, replacing
volume = extractAfter(opts.VariableNames{index},'Volume_');
volume = sscanf(strrep(volume,'_','.'),'%f');
with
volume = str2double(regexp(opts.VariableNames{index},'\d+(\.\d+)?(?=\)$)','once','match'));
results in a 40x1 double array of NaN, without any time saved.
(My initial thought upon seeing this answer: It seems once someone learns regular expressions one should be given an honorary degree in Computer Science ... perhaps a Bachelor's ...)
Stephen23 on 14 Mar 2018
Edited: Stephen23 on 14 Mar 2018
@Daniel Bridges: please upload opts in a .mat file.
Daniel Bridges on 14 Mar 2018
Here you are.
Stephen23
@Daniel Bridges: sorry, my old MATLAB can't read that object type. Please upload this .mat file:
vn = opts.VariableNames;
save('varnames.mat','vn')
Daniel Bridges on 15 Mar 2018
Okay.
Stephen23 on 15 Mar 2018
Edited: Stephen23 on 15 Mar 2018
@Daniel Bridges: thank you for uploading that .mat file. The char vectors in that cell array have a different format from the one that you showed in your question, apparently with the parentheses and decimal point replaced by underscores. That is why the original pattern, which expects a closing parenthesis at the end, finds no match and returns NaN. You can easily process these names by first replacing the underscores with periods:
>> S = load('varnames.mat');
>> C = strrep(S.vn,'_','.');
>> str2double(regexp(C,'\d+(\.\d+)?(?=\.$)','once','match'))
ans =
Columns 1 through 7
NaN 18494 27.49 4.9 57.77 139.4 479.8
Columns 8 through 14
1.08 29.15 95.8 97.26 72.3 93.45 80.69
Columns 15 through 16
83.06 32.39
As an alternative you could skip detectImportOptions (which I guess makes these character replacements) and read the header lines using fgetl. The header line could then be trivially processed by a regular expression similar to the one I showed you.
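The fgetl approach might look like the sketch below. It writes its own small test file so that it runs on its own. The file layout (one leading header line, then a comma-separated line of column names) is an assumption based on the 'NumHeaderLines', 1 option used earlier, not the actual MIM Maestro export format.

```matlab
% Write a small hypothetical file so the sketch is self-contained
filepath = [tempname, '.csv'];
fid = fopen(filepath, 'w');
fprintf(fid, 'DVH export\n');   % assumed leading header line
fprintf(fid, 'Rectum (4)(Volume: 57.77),Bladder (5)(Volume: 139.40)\n');
fprintf(fid, '1.0,2.0\n');      % made-up data row
fclose(fid);

% Read the raw header text instead of relying on detectImportOptions
fid = fopen(filepath, 'r');
fgetl(fid);                     % skip the first header line
headerLine = fgetl(fid);        % line holding the unmodified column names
fclose(fid);

% Split into column names (the comma delimiter is an assumption)
names = strsplit(headerLine, ',');
vol = str2double(regexp(names, '(?<=Volume: )\d+(\.\d+)?(?=\)$)', 'once', 'match'));

delete(filepath);               % clean up the temporary file
```

Since the names are read as plain text, the parentheses and decimal points survive, and the pattern from the answer applies unchanged.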


Asked: 14 Mar 2018

Edited: 15 Mar 2018
