How to read specific data from text file between 2 lines
Hello,
I have the attached text file, formatted as seen below, and I would like to extract and plot only the numbers in the SP_0 through SP_19 tags.

I have tried looping through the file and using the cell2mat function once the "SETPOINT" tag is found but am not having any luck. Any help would be greatly appreciated!
while ~feof(fid)
    lineData = fgetl(fid); % read a line
    if strfind(lineData,'SETPOINT'), break, end % found the first 'SETPOINT' so quit
end
data=cell2mat(textscan(fid,repmat('%d',1,1),'collectoutput',1));
Answers (1)
The proper way to do this: your file is not a plain text file but an XML file. Use xmlread and navigate the DOM, or use xml2struct from the File Exchange if navigating the DOM is too complicated. With xml2struct, the code would be something like this:
xmltree = xml2struct('pathtothefile');
setpoints = xmltree.RECIPE.SETPOINTS;
desiredsetpoints = arrayfun(@(n) str2double(setpoints.(sprintf('SP_%d', n)).Text), 0:19);
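If you would rather not depend on a File Exchange download, the xmlread route could look roughly like the sketch below. It assumes the same RECIPE > SETPOINTS > SP_n layout as the xml2struct code above; the sample XML text, the temporary file and the variable nsetpoints are made up for illustration, so adapt them to the real file.

```matlab
% Hypothetical sample mirroring the assumed layout RECIPE > SETPOINTS > SP_n.
xmltext = ['<RECIPE><SETPOINTS>' ...
           '<SP_0>10</SP_0><SP_1>20</SP_1><SP_2>30</SP_2>' ...
           '</SETPOINTS></RECIPE>'];
xmlfile = [tempname '.xml'];       % temporary file standing in for the real one
fid = fopen(xmlfile, 'w');
fprintf(fid, '%s', xmltext);
fclose(fid);

doc = xmlread(xmlfile);            % parse the file into a Java DOM document
% Restrict the search to the first SETPOINTS element (assumed to exist).
setpointsnode = doc.getElementsByTagName('SETPOINTS').item(0);

nsetpoints = 3;                    % would be 20 for SP_0 through SP_19
desiredsetpoints = nan(1, nsetpoints);
for n = 0:nsetpoints-1
    nodes = setpointsnode.getElementsByTagName(sprintf('SP_%d', n));
    if nodes.getLength() > 0
        % getTextContent returns a Java string, hence the char() conversion
        desiredsetpoints(n+1) = str2double(char(nodes.item(0).getTextContent()));
    end
end
delete(xmlfile);                   % clean up the temporary file
```

Because each tag is looked up by name, the result is ordered by setpoint number regardless of the order of the tags in the file, and a missing tag simply leaves a NaN in its slot.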
The cheap way to do it is to use a regular expression to extract the setpoints. It'll be faster but can break in all sorts of interesting ways if something else in the file happens to match the regex.
filecontent = fileread('pathtothefile');
desiredsetpoints = str2double(regexp(filecontent, '(?<=<SP_1?[0-9]>)\d+', 'match'))
The regexp also doesn't check that the setpoints are in the right order. Nothing guarantees that the tags appear in numeric order in the XML file, so use at your own risk.
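One possible way to make the regexp version independent of tag order is to capture the setpoint number along with its value and use it as an index. This is only a sketch: the sample string is hypothetical, and it assumes at least one tag matches and that the values are unsigned integers, as in the regexp above.

```matlab
% Hypothetical content with the tags deliberately out of order.
filecontent = '<SP_1>20</SP_1><SP_0>10</SP_0><SP_2>30</SP_2>';

% Capture both the setpoint number and its value for every opening tag.
tokens = regexp(filecontent, '<SP_(1?[0-9])>(\d+)', 'tokens');
pairs = str2double(vertcat(tokens{:}));   % one row per match: [number, value]

% Place each value at the position given by its number (SP_0 -> element 1).
desiredsetpoints = nan(1, 20);            % setpoints not found stay NaN
desiredsetpoints(pairs(:, 1) + 1) = pairs(:, 2);
```

It still suffers from the other weakness of the regexp approach: anything else in the file that looks like an SP tag will be picked up as well.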
5 Comments
Michael Lopez
on 12 Oct 2018
Michael Lopez
on 12 Oct 2018
Edited: Michael Lopez
on 12 Oct 2018
Guillaume
on 12 Oct 2018
Yes, forgot to look at the text of the tag in the xml2struct version. Fixed now.
"Or if I use the regexp method, how can I specify to only read in the values between the recipe tags?"
To do it sort of safely, you'd have to do it in two steps: one regexp to extract the content of the recipe tag and another one to parse that content. Regexes are not recommended for parsing XML/HTML content. It's too easy to break them, or they become very complicated if you want them foolproof.
Using a parser designed for the format is a lot safer, so I would really recommend you use the first option.
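For completeness, the two-step regexp idea described above could be sketched as follows. The sample content is hypothetical, and the code assumes the tag is spelled RECIPE (as in the xml2struct code), is not nested, and occurs at least once in the file.

```matlab
% Hypothetical content: one SP_0 tag outside the recipe, two setpoints inside.
filecontent = ['<FILE><SP_0>99</SP_0>' ...
               '<RECIPE><SETPOINTS><SP_0>10</SP_0><SP_1>20</SP_1></SETPOINTS></RECIPE>' ...
               '</FILE>'];

% Step 1: capture everything between the first <RECIPE ...> and the next </RECIPE>.
% [^>]* tolerates attributes on the opening tag; .*? is lazy so it stops at the
% first closing tag. An empty result here means no recipe block was found.
recipe = regexp(filecontent, '<RECIPE[^>]*>(.*?)</RECIPE>', 'tokens', 'once');

% Step 2: run the setpoint regexp on that block only.
desiredsetpoints = str2double(regexp(recipe{1}, '(?<=<SP_1?[0-9]>)\d+', 'match'));
```

With this sample, the SP_0 value outside the recipe block is ignored and only the two values inside it are returned.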
Michael Lopez
on 12 Oct 2018
Michael Lopez
on 13 Oct 2018