Using Textscan on non-uniform data

Russell Nasrallah on 18 Jun 2019
Edited: per isakson on 21 Jun 2019
Hello all,
I am currently trying to format the output of a Fortran code into CSV using the textscan function in MATLAB. The Fortran output is semi-uniform, but its layout changes depending on the number of nodes the user requests.
In this example, the user has specified 12 nodes and the text file looks like the following:
Interpolated values at stations
168 12
867600.0000000000 % this is the time step at which the values at the following 12 nodes is found.
1 0.3170054495E+03
2 0.2983347787E+03
3 0.2857833907E+03
4 0.2825696256E+03
5 0.2795692154E+03
6 0.2806315572E+03
7 0.2811630597E+03
8 0.2814156663E+03
9 0.2814718273E+03
10 0.2811785316E+03
11 0.2807765370E+03
12 0.2798665405E+03
871200.0000000000
1 0.3042805523E+03
2 0.3033600277E+03
3 0.2913505094E+03
4 0.2790455081E+03
5 0.2709832029E+03
6 0.2680434294E+03
7 0.2677295494E+03
8 0.2684905990E+03
9 0.2690373464E+03
10 0.2696588011E+03
11 0.2699294457E+03
12 0.2697688946E+03
Currently, I have textscan skipping the first two lines. My final output goal would look something like the following:
timestep 1, node 1, node 2, ..., node 11, node 12
timestep 2, node 1, node 2, ..., node 11, node 12
I would like the code to be smart enough to determine the number of nodes the user supplied (given on the second line of the text above) and to distinguish the time step lines from the node lines.
Any suggestions?
I've attached an example of one of my text files.
  1 comment
per isakson on 18 Jun 2019
Search Answers for "read text tag:block" using the search field in the upper right corner.


Accepted Answer

per isakson on 19 Jun 2019
Edited: per isakson on 19 Jun 2019
An exercise with fscanf()
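%%
ffs = "HS_full_18md_nam_outputs.txt";
fid = fopen( ffs, 'r' );
[~] = fgetl( fid );                            % skip the header line
num = fscanf( fid, '%d%d', [2,1] );            % second line; num(2) is the number of nodes
% Each record is one time value followed by num(2) pairs (node index skipped, value kept).
buf = fscanf( fid, ['%f', repmat('%*d%f', 1,num(2) ) ], [num(2)+1,inf] );
[~] = fclose( fid );
out = permute( buf, [2,1] );                   % transpose: one row per time step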
A peek at the result:
>> out(1:3,1:6)
ans =
8.676e+05 1.9204e-05 3.981e-05 5.6839e-05 7.3688e-05 9.2944e-05
8.712e+05 2.2396e-05 4.0073e-05 5.1601e-05 6.1428e-05 7.487e-05
8.748e+05 1.9849e-05 3.1175e-05 4.2591e-05 5.3355e-05 6.5603e-05
>>
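The question's end goal was a CSV file, which the snippet above stops short of. The following is only a sketch of that last step, under a few assumptions: the stand-in input is a cut-down copy of the sample in the question (2 time steps, 3 nodes) written to a temporary file, the file names are hypothetical, and writematrix requires R2019a or later (on older releases dlmwrite with a 'precision' option could be swapped in).

```matlab
% Build a small stand-in input file in the layout shown in the question.
% (Assumption: 2 time steps and 3 nodes, values copied from the question.)
lines = { ...
    'Interpolated values at stations', ...
    '2 3', ...
    '867600.0000000000', ...
    '1 0.3170054495E+03', ...
    '2 0.2983347787E+03', ...
    '3 0.2857833907E+03', ...
    '871200.0000000000', ...
    '1 0.3042805523E+03', ...
    '2 0.3033600277E+03', ...
    '3 0.2913505094E+03'};
ffs = [tempname, '.txt'];                 % hypothetical scratch file
fid = fopen(ffs, 'w');
fprintf(fid, '%s\n', lines{:});
fclose(fid);

% Parse it with the same fscanf pattern as in the answer above.
fid = fopen(ffs, 'r');
fgetl(fid);                               % skip the header line
num = fscanf(fid, '%d%d', [2,1]);         % num(2) is the number of nodes
buf = fscanf(fid, ['%f', repmat('%*d%f', 1, num(2))], [num(2)+1, inf]);
fclose(fid);
out = buf.';                              % one row per time step: [time, node 1, ..., node N]

% Write one comma-separated row per time step.
csvName = [tempname, '.csv'];             % hypothetical output name
writematrix(out, csvName);                % R2019a+; comma delimiter inferred from .csv
```

Each row of the CSV then holds the time value followed by the node values, matching the "timestep, node 1, ..., node N" layout asked for.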
  6 comments
Russell Nasrallah on 20 Jun 2019
Thanks for the response from both of you again.
Per,
When you say "In the code that will be used one month from now..." what do you mean? Is there an update coming that is going to modify the fopen function?
per isakson on 21 Jun 2019
Edited: per isakson on 21 Jun 2019
"In the code that will be used one month from now..."
I was trying to say that there are two types of code with regard to error handling:
  • Small scripts/functions that you yourself use a few times during a short period of time. In this case it might be OK to skip error checking. MATLAB will show more or less relevant error messages, often several lines "too late", that is, after the line that actually caused the problem.
  • Scripts/functions that will be used over a longer period of time. In this case error handling with good messages can help find the real cause of the problem quickly.
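As an illustration of the second case, here is a minimal sketch of checking the result of fopen right away instead of letting a later read fail. The error identifier 'stationsDemo:openFailed' and the file path are made up for this example; the path comes from tempname and is assumed not to exist.

```matlab
ffs = [tempname, '.txt'];            % hypothetical path; tempname returns an unused name

[fid, msg] = fopen(ffs, 'r');        % the second output carries the reason when opening fails
try
    if fid < 0
        % Fail here, naming the file and the cause, rather than at the first read.
        error('stationsDemo:openFailed', 'Could not open "%s": %s', ffs, msg);
    end
    closer = onCleanup(@() fclose(fid));   % close the file even if parsing errors later
    header = fgetl(fid);                   % only reached when the file opened
catch err
    caught = err.identifier;               % kept so the outcome can be inspected
    fprintf('%s\n', err.message);
end
```

Without the check, fopen silently returns -1 and the problem only surfaces at the first read from the invalid identifier, with a message that does not mention the file name.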


More Answers (1)

Walter Roberson on 18 Jun 2019
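fid = fopen('HS_full_18md_nam_outputs.txt');
fgets(fid);                        % skip the header line
ctl = fscanf(fid, '%f%f', 2);      % second line, read here as Nt (time steps) and Ns (nodes)
Nt = ctl(1);
Ns = ctl(2);
data = zeros(Nt, Ns+1);            % one row per time step: [time, value at each node]
for ts = 1 : Nt
    timestep = fscanf(fid, '%f', 1);                    % the time step line
    thisdata = cell2mat(textscan(fid, '%*f%f', Ns));    % Ns node lines; the node index is skipped
    data(ts, 1) = timestep;
    data(ts, 2:end) = thisdata;
end
fclose(fid);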
  4 comments
Walter Roberson on 19 Jun 2019
fscanf and textscan both stop when the size inputs have been satisfied, leaving the input buffer position immediately after the last character that was consumed. That might be in the middle of a line.
Neither function specifically processes input line by line. Instead, unless you use uncommon options, both ignore leading whitespace, including line boundaries. If, for example, you ask for 3 numbers, then neither function cares whether the input is
1 2 3
or
1
2 3
(note the empty line on input)
There is a difference between the two, though. For fscanf, the count you provide is the total number of values to read. For textscan, the count is the number of times to repeat the format. When a format describes an entire line, that count can typically be interpreted as the number of lines to read (which is not entirely accurate if the values are not in the expected format).
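The difference in how the count is interpreted can be seen with a small experiment. This is only a sketch: the three-line scratch file and its values are invented for the demonstration.

```matlab
demo = [tempname, '.txt'];             % hypothetical scratch file
fid = fopen(demo, 'w');
fprintf(fid, '1 10\n2 20\n3 30\n');    % three lines, two numbers per line
fclose(fid);

fid = fopen(demo, 'r');
a = fscanf(fid, '%f%f', 2);            % count 2 = two values in total
frewind(fid);                          % start again from the top of the file
c = textscan(fid, '%f%f', 2);          % count 2 = apply the two-field format twice
fclose(fid);
delete(demo);

disp(a.')                              % 1 10         (first line only)
disp([c{1}, c{2}])                     % 1 10 ; 2 20  (first two lines)
```

With the same format and the same count, fscanf consumes two numbers while textscan consumes four, which is why the Ns passed to textscan in the answer above reads Ns complete node lines.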
Russell Nasrallah on 20 Jun 2019
Thank you for this clear and concise answer, Walter! I totally understand these tools much better now.
