Question about optimizing reading data from text file

Question

0 votes

Hello, thanks for reading this,

I currently have a reader for mesh files. It works, but depending on the size of the file it can take a very long time, and I am hoping to optimize it for speed.

What I do first is read in a text file and change every line into a matrix of characters using the lines:

   cac = textscan( fid, '%[^\n]' );
   fclose(fid);
   A  = char( cac{1} );

where A is my character matrix. I then search through the text for the identifiers that mark the data I need, and record a start index and an end index for each block of data. I basically read this line by line, and at the moment I assume the file will always be formatted in a certain way.

After I have these indices, I call sscanf on each line to read the characters as %f or %x numbers and store them in matrices. This is the part that the profiler says takes the longest to complete.

I posted the MATLAB reader function here: http://pastebin.com/FFtgXzg4, since it is a bit long to post here. My specific questions are: do I have to convert the whole text import into a character matrix, and is there any way I can do this without needing a for loop? The loops using sscanf take a very long time.
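One way to avoid both the character matrix and the per-line loop is sketched below. It is only an illustration: the marker strings `(nodes` and `)`, the sample lines, and the variable names are all invented, because the actual identifiers in the mesh file are not shown in this post. The idea is to keep the cell array of lines that textscan already returns, locate the block with strcmp, and hand the whole block to a single sscanf call.

```matlab
% Hypothetical stand-in for cac{1}; markers and values are invented.
txtLines = { '(nodes'; '1.0 2.0 3.0'; '4.0 5.0 6.0'; ')'; '(cells'; 'a b 1f 0 2'; ')' };

% Locate the block directly in the cell array (no char matrix needed).
startIdx = find(strcmp(txtLines, '(nodes')) + 1;        % first data line
closers  = find(strcmp(txtLines, ')'));                 % all closing markers
endIdx   = closers(find(closers > startIdx, 1)) - 1;    % last data line

% Join the block into one newline-separated string and parse it in one call.
txt   = sprintf('%s\n', txtLines{startIdx:endIdx});
nodes = sscanf(txt, '%f', [3 Inf]).';                   % one row per line, 3 columns
```

The `[3 Inf]` size argument assumes every line in the block holds exactly three numbers; a different column count would need a different first dimension.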

It works, but only just. I can send a test import file if needed.

1 comment

Cedric on 24 May 2013

Could you post e.g. 20 lines of your data file, and define the identifiers that you are referring to?


Answer 1

Jonathan Sullivan on 23 May 2013

Edited: Jonathan Sullivan on 23 May 2013


0 votes

You may want to use fread and regexp.

Without seeing your file, I can't say for sure this will produce the same result, but it should give you a good starting point.

% Using regexp and fread
fid = fopen(filename,'r');
tic;
A = regexp(fread(fid,'*char')','\n','split');
A = char( A{:} );
toc
fclose(fid);
% Using textscan
fid = fopen(filename,'r');
tic;
B = textscan(fid,'%[^\n]');
B2 = char(B{1});
toc
fclose(fid);

1 comment

Brian on 23 May 2013

It seems that the textscan approach I have runs slightly faster than the regexp/fread combination. There is one last part of the code that is still giving me problems:

When I have my start and end indices, I use sscanf line by line to give me the real data I need. However, some of my character matrices can be very large: sometimes spanning hundreds of thousands of rows (depending on the number of tetrahedra I have).

Is there a smarter way to read this than calling sscanf line by line, for example by applying it to the whole block at once in a vectorized way? Or should I look into exporting the matrix to a formatted text file and re-importing it using textread and hex2dec?

In these areas, I will always have the following combination of characters:

xxx xxx xxxx x x,

where I believe it can be split by a space delimiter. That leaves me with five hexadecimal values per row.
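For rows of this shape, a single sscanf call over the whole block may be enough. The sketch below is an assumption-laden illustration: the sample rows are invented, the trailing comma above is taken to be sentence punctuation rather than file content, and every row is assumed to hold exactly five space-separated hexadecimal tokens.

```matlab
% Invented sample rows in the 'xxx xxx xxxx x x' layout; in the real reader
% this would be the slice A(startIdx:endIdx, :) of the character matrix.
A = char('1a2 0ff 1234 1 0', '00b 3c4 0abc 2 f');

% Append a blank column so the last token of one row cannot run into the
% first token of the next, then flatten row by row into one long string.
block = [A, repmat(' ', size(A, 1), 1)];
txt   = reshape(block.', 1, []);

% One sscanf call converts every hexadecimal token; [5 Inf] fills five
% values per column, and the transpose gives one row per input line.
vals = sscanf(txt, '%x', [5 Inf]).';
```

Because `%x` skips whitespace, the blank padding that char adds to shorter rows does no harm. This replaces hundreds of thousands of small sscanf calls with one large one, which is usually where the time goes in a per-line loop.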
