single vs multiple fprintf() efficiency

Haider Ali on 7 Feb 2018
Commented: Haider Ali on 7 Feb 2018
Hi,
I want to write some text data (integer and string types) to a .txt file using fprintf(), and I am looking for the most efficient way of doing so. I was under the impression that a single fprintf() call would be more efficient than multiple calls, but that is not the case, as the code below shows.
This first approach takes approx. 4 seconds to write.
str1 = "str1";
str2 = "str2str2";
str3 = "str3str3str3";
str4 = "str4str4str4str4";
str5 = "str5str5str5str5str5";
formatSpec = '%d%s%s%s%s%s%04X\n';
fid = fopen("post.txt", 'Wt');
tic
for index = 1:10000
    fprintf(fid, formatSpec, index, str1, str2, str3, str4, str5, index);
end
toc
fclose(fid);
This second approach takes approx. 1 second to write, although it makes multiple calls to fprintf().
str1 = "str1";
str2 = "str2str2";
str3 = "str3str3str3";
str4 = "str4str4str4str4";
str5 = "str5str5str5str5str5";
str_newline = "\n";
fid = fopen("post.txt", 'Wt');
tic
for index = 1:10000
    fprintf(fid, '%d', index);
    fprintf(fid, str1);
    fprintf(fid, str2);
    fprintf(fid, str3);
    fprintf(fid, str4);
    fprintf(fid, str5);
    fprintf(fid, '%04X', index);
    fprintf(fid, str_newline);
end
toc
fclose(fid);
I am aware that I am not specifying the format in the middle calls, but the file written is the same in both cases.
  1. What could be the reason for this behavior?
  2. How can I skip the format for some of the fields (e.g. the string fields) in the formatSpec argument in approach 1?
  3. How do I specify formatSpec if I want to pass string arrays instead of single strings, so that all the string data is written in one go? (Currently I get an error.)
  4. What would be the most efficient way to write data in my use case (integer + string fields)?
Thanks.
  2 comments
Walter Roberson on 7 Feb 2018
"I am aware that I am not specifying the format in the middle calls. But the file written is the same in both cases."
... but would not be if the strings contained any % or \ characters.
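To illustrate the point above, here is a minimal sketch using sprintf, which processes its format the same way fprintf does. The sample text is hypothetical and not from the original post:

```matlab
% Hypothetical sample text containing a backslash sequence.
s = 'a\tb';

% Passed as the FORMAT: '\t' is interpreted as a tab character.
asFormat = sprintf(s);          % 'a', tab, 'b'

% Passed as DATA with '%s': the text is copied unchanged.
asData = sprintf('%s', s);      % the four characters a \ t b

% A doubled percent sign behaves similarly: as a format, '%%' collapses to '%'.
pctAsFormat = sprintf('100%%');
pctAsData   = sprintf('%s', '100%%');
```

So the two approaches only produce identical files as long as the strings contain no % or \ characters.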
Haider Ali on 7 Feb 2018
@Walter, let's assume that no such characters appear in my use case.


Answers (2)

Jos (10584) on 7 Feb 2018
You can skip the format identifier for string inputs simply because fprintf uses the format identifier to transform its inputs into text. There is no need for that when the inputs are strings already, but the string is then treated as the format itself and gets parsed.
So, use fprintf(fid,'%s',str#) for a fair comparison of the timings. You'll probably see that the second code then runs slower :)
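A sketch of that fairer comparison, assuming the same strings as in the question. The scratch file name (built with tempname) is an assumption for illustration, and no timing result is claimed here since it depends on machine, MATLAB release and OS caching:

```matlab
str1 = "str1";
str2 = "str2str2";
str3 = "str3str3str3";
str4 = "str4str4str4str4";
str5 = "str5str5str5str5str5";

outFile = [tempname, '.txt'];      % hypothetical scratch file, not 'post.txt'
fid = fopen(outFile, 'Wt');
tic
for index = 1:10000
    fprintf(fid, '%d', index);
    fprintf(fid, '%s', str1);      % explicit '%s': string is data, not format
    fprintf(fid, '%s', str2);
    fprintf(fid, '%s', str3);
    fprintf(fid, '%s', str4);
    fprintf(fid, '%s', str5);
    fprintf(fid, '%04X\n', index);
end
elapsed = toc;                     % compare with the single-call version
fclose(fid);
```

Running both variants several times and comparing the fastest runs gives a more reliable picture than a single measurement.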
  4 comments
Walter Roberson on 7 Feb 2018
I wonder how the timing changes if you use character vectors instead of string objects?
Haider Ali on 7 Feb 2018
@Walter, I just changed the strings to character vectors and it has reduced the time to approx half (0.5 seconds) using first approach but using second approach with character vectors still takes approx 1 second. So I think I will go with the first approach along with character vectors and then use arrays of those character vectors to eliminate the for loop.



Jan on 7 Feb 2018
Edited: Jan on 7 Feb 2018
Replace
fprintf(fid, str1);
by
fwrite(fid, str1, 'char')
so that fprintf does not try to parse the string.
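A small sketch of the fwrite variant. The sample text and the scratch file name are assumptions chosen to show that nothing is interpreted:

```matlab
% Hypothetical text with characters that fprintf would treat specially.
txt = '50% done\n';

outFile = [tempname, '.txt'];     % hypothetical scratch file
fid = fopen(outFile, 'W');        % binary mode, no line-break translation
fwrite(fid, txt, 'char');         % characters are written as-is, no format parsing
fclose(fid);

readBack = fileread(outFile);     % same ten characters, backslash included
```

Because fwrite does no format processing, the percent sign and the backslash-n pair end up in the file literally.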
You can omit the format specifiers for static strings by inserting the strings into the format once, outside the loop:
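str1 = 'str1';
str2 = 'str2str2';
str3 = 'str3str3str3';
str4 = 'str4str4str4str4';
str5 = 'str5str5str5str5str5';
eol = char([13, 10]);  % hard-coded CR+LF; named eol to avoid shadowing the built-in newline
fid = fopen('post.txt', 'W');
% '%%d' and '%%04X' survive sprintf as '%d' and '%04X'; the fixed strings are inserted once
formatSpec2 = '%%d%s%s%s%s%s%%04X';
formatSpec = [sprintf(formatSpec2, str1, str2, str3, str4, str5), eol];
% Each line uses the index twice (%d and %04X), so each column supplies it twice
fprintf(fid, formatSpec, [1:10000; 1:10000]);
fclose(fid);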
The idea is to create the fixed parts only once. With your example code, you can even omit the loop.
Text mode requires that line breaks be detected and translated. In binary mode with hard-coded line breaks, the code should be faster and does not depend on the platform.
Remember that the timing of file access also depends on the operating system and the hard disk: there are caches in the OS and on the disk. Creating a file twice by the same method can take very different times.
  1 comment
Haider Ali on 7 Feb 2018
@Jan Simon, what if the strings are not fixed/static? I am attaching a template of the file to be written (only a portion). Currently I write the file one field at a time (Index, Message Format, Terminal Address and so on; see the attached file) and one line per loop iteration. What I want to do is build an array for each field/column and then write all those arrays using a single fprintf() statement to increase the writing speed. Can you please have a look at the attached file and suggest something? Thanks
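One common idiom for this is to interleave the columns in a cell array and expand it into a single fprintf call. The sketch below is only an illustration: the attached template is not available here, so the column names, values, separators and scratch file name are all hypothetical.

```matlab
% Hypothetical column data (one entry per output line).
idx  = (1:3).';                       % integer column
msg  = {'alpha'; 'beta'; 'gamma'};    % text column, cell array of char vectors
addr = [10; 255; 4096];               % integer column, written as hex

% Build a 3-by-N cell: each COLUMN holds the fields of one output line.
rows = [num2cell(idx).'; msg.'; num2cell(addr).'];

outFile = [tempname, '.txt'];         % hypothetical scratch file
fid = fopen(outFile, 'W');            % binary mode with hard-coded line breaks
% rows{:} expands column by column, so the format is reused once per line.
fprintf(fid, '%d %s %04X\r\n', rows{:});
fclose(fid);

written = fileread(outFile);
```

The comma-separated expansion rows{:} passes each field as its own argument, which lets numeric and text columns be mixed in one call. For very large tables the expansion itself costs time and memory, so it is worth timing against the loop on real data.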

