For a big matrix, how to accelerate fprintf?

Tian on 10 Jan 2017
Commented: Scott Campbell on 7 Dec 2022
Hello everyone, I have a 2500-by-1500 matrix and I want to print every column to a text file, 5 numbers per row. Using:
for i=1:1500,
    fprintf(fid, 'This is the %d coefficients\n', i);
    S=sprintf(' %15.8E %15.8E %15.8E %15.8E %15.8E\n', coeff(:, i));
    S(S=='E')='D';
    fprintf(fid, '%s', S);
end
it takes several seconds. How can I speed this up?
  3 comments
Tian on 10 Jan 2017
Edited: Tian on 10 Jan 2017
Apologies, I missed a '\n' in the first fprintf.
I am actually constructing a formatted file in a layout that many software packages already accept, so I have to add a title line, 'This is the %d coefficients' (just as an example), before printing each coeff(:,i).
Tian on 10 Jan 2017
Edited: Tian on 10 Jan 2017
By 'writing in binary', do you mean using fprintf(fid, '%s', double(S)); instead of fprintf(fid, '%s', S);?
I just tried this and found that fprintf(fid, '%s', double(S)); took more than twice as long.
If I use fopen('test.txt', 'wb') instead of fopen('test.txt', 'w'), the time required is the same.
If I misunderstood your suggestion, please let me know. Thank you~


Accepted Answer

Walter Roberson on 10 Jan 2017
You have a few different speed constraints:
  • the speed of formatting individual numeric items, where you are already using the fastest way
  • the overhead of calling fprintf() and sprintf() multiple times, which could potentially be reduced by formatting everything at one time and then writing it all
  • the cost of substituting 'D' for 'E', which could possibly be done more efficiently (but your current version looks pretty good as-is)
  • the overhead of doing the substitution multiple times, which could potentially be reduced by building the whole output text and then doing the substitution all at once
  • the cost of writing to disk, which you cannot get away from (except perhaps by touching up the buffering strategy, as Jan shows)
You are not calling sprintf() wastefully, such as with just one value at a time, so it is not obvious that there is a lot of overhead that could be cut by formatting everything at once.
Formatting everything at once is possible, but it drives up your memory costs a fair bit, to the point where you have to question whether the memory allocation costs of the large arrays would exceed the savings from calling sprintf() less often. That is especially true once you make the adjustments needed for columns whose length is not always a multiple of 5.
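For anyone who wants to measure that trade-off, here is a rough sketch of what formatting everything at once might look like. It is not from the original post: `formatAllColumns` is a hypothetical helper name, `coeff` is assumed to be a real double matrix, and ending a short last line with its own newline is my assumption about the desired layout. For the 2500-by-1500 case the result is about 60 million characters held in memory at once.

```matlab
function txt = formatAllColumns(coeff)
% formatAllColumns  Hypothetical helper: format every column in one sprintf call.
% Sketch only; assumes coeff is a real double matrix.
    [nRows, nCols] = size(coeff);
    perLine = 5;                          % numbers per output line
    numFmt  = ' %15.8E';
    nFull   = floor(nRows / perLine);     % complete lines per column
    nLeft   = rem(nRows, perLine);        % numbers on a final short line

    % Format for one whole column: title line, then the full lines.
    colFmt = ['This is the %d coefficients\n', ...
              repmat([repmat(numFmt, 1, perLine), '\n'], 1, nFull)];
    if nLeft > 0
        % Assumption: a short last line is terminated with its own newline.
        colFmt = [colFmt, repmat(numFmt, 1, nLeft), '\n'];
    end

    % Prepend the column index so each column of data supplies the %d
    % followed by its values; sprintf recycles colFmt once per column.
    txt = sprintf(colFmt, [1:nCols; coeff]);

    % One substitution pass over everything. This also touches the title
    % text, which is harmless here only because the title has no 'E'.
    txt(txt == 'E') = 'D';
end
```

The text would then be written with a single call such as fwrite(fid, formatAllColumns(coeff), 'char'). Whether this beats the per-column loop depends on how the allocation of the large char array compares with the saved call overhead, so it is worth timing both on the real data.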
My tests show that regexprep() is roughly 16 times slower than your existing S(S=='E')='D', so you would probably have difficulty being more efficient on that portion.
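A comparison along those lines can be reproduced with something like the script below. This is a sketch, not the test that produced the figure above: the random data and the helper name `replaceByIndex` are made up for illustration, a local function in a script needs R2016b or later, and the ratios will vary with machine and MATLAB release.

```matlab
% Sketch: compare three ways of replacing 'E' with 'D' in one formatted column.
x = rand(2500, 1);                               % stand-in for one column of coeff
S = sprintf(' %15.8E %15.8E %15.8E %15.8E %15.8E\n', x);

S1 = S;  S1(S1 == 'E') = 'D';                    % logical indexing, as in the question
S2 = strrep(S, 'E', 'D');                        % strrep, as in Jan's answer
S3 = regexprep(S, 'E', 'D');                     % regexprep

assert(isequal(S1, S2, S3));                     % all three must give the same text

% timeit calls each function handle repeatedly and returns a typical time.
tIndex  = timeit(@() replaceByIndex(S));
tStrrep = timeit(@() strrep(S, 'E', 'D'));
tRegexp = timeit(@() regexprep(S, 'E', 'D'));
fprintf('index %.3g s, strrep %.3g s, regexprep %.3g s\n', ...
        tIndex, tStrrep, tRegexp);

function S = replaceByIndex(S)
    % Wrapper so the indexing assignment can be timed through a handle.
    S(S == 'E') = 'D';
end
```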
Since you have already cut down on overhead and are stuck with the numeric formatting time and the file I/O time, I think you are already close to as fast as you can reasonably get for that output format.
  1 comment
Tian on 11 Jan 2017
Thanks a lot for your detailed explanation. That's very helpful.


More Answers (1)

Jan on 10 Jan 2017
This could be slightly faster:
fid = fopen(FileName, 'W'); % Uppercase W: write without automatic flushing
if fid == -1
    error('Cannot open file for writing: %s', FileName);
end
for i = 1:1500
    fprintf(fid, 'This is the %d coefficients\n', i);
    S = sprintf(' %15.8E %15.8E %15.8E %15.8E %15.8E\n', coeff(:, i));
    fwrite(fid, strrep(S, 'E', 'D'), 'char');
end
fclose(fid); % Close the file so that buffered output is written out
But I assume the bottleneck is the slow disk transfer. The 'W' can reduce this; using an SSD would be better.
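To check how much the 'W' mode helps on a given machine, the same loop can be timed with both modes. This is only a sketch: the random `coeff` is stand-in data of the size in the question, and the file names under tempdir are placeholders.

```matlab
% Sketch: time the same write loop with 'w' and with 'W'
% ('W' is documented as writing without automatic flushing).
coeff = rand(2500, 1500);                % stand-in data, same size as in the question
modes = {'w', 'W'};
for m = 1:numel(modes)
    % Use the index in the name, since 'w' and 'W' would collide on
    % case-insensitive file systems.
    fileName = fullfile(tempdir, sprintf('fprintf_test_%d.txt', m));
    fid = fopen(fileName, modes{m});
    if fid == -1
        error('Cannot open file for writing: %s', fileName);
    end
    tic;
    for i = 1:size(coeff, 2)
        fprintf(fid, 'This is the %d coefficients\n', i);
        S = sprintf(' %15.8E %15.8E %15.8E %15.8E %15.8E\n', coeff(:, i));
        fwrite(fid, strrep(S, 'E', 'D'), 'char');
    end
    fclose(fid);                         % closing writes out anything still buffered
    fprintf('mode ''%s'': %.2f s\n', modes{m}, toc);
end
```

Both files should end up with identical contents, so any difference in the printed times comes from the buffering and from disk caching effects between the two runs.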
  2 comments
Tian on 11 Jan 2017
Thanks. I'd like to try your method.
Scott Campbell on 7 Dec 2022
My 15 Mb csv file went from 30 to 10 seconds.
