single vs multiple fprintf() efficiency
Hi,
I want to write some text data (integer and string types) to a .txt file using fprintf(), and I am looking for the most efficient way to do so. I was under the impression that a single fprintf() call would be more efficient than several calls, but the code below suggests otherwise.
This first approach takes approx 4 seconds to write.
str1 = "str1";
str2 = "str2str2";
str3 = "str3str3str3";
str4 = "str4str4str4str4";
str5 = "str5str5str5str5str5";
formatSpec = '%d%s%s%s%s%s%04X\n';
fid = fopen("post.txt", 'Wt');
tic
for index = 1:10000
fprintf(fid, formatSpec, index, str1, str2, str3, str4, str5, index);
end
toc
fclose(fid);
This second approach takes approx 1 sec to write although it has multiple calls to fprintf().
str1 = "str1";
str2 = "str2str2";
str3 = "str3str3str3";
str4 = "str4str4str4str4";
str5 = "str5str5str5str5str5";
str_newline = "\n";
fid = fopen("post.txt", 'Wt');
tic
for index = 1:10000
fprintf(fid, '%d', index);
fprintf(fid, str1);
fprintf(fid, str2);
fprintf(fid, str3);
fprintf(fid, str4);
fprintf(fid, str5);
fprintf(fid, '%04X', index);
fprintf(fid, str_newline);
end
toc
fclose(fid);
I am aware that I am not specifying the format in the middle calls. But the file written is the same in both cases.
- What could be the reason for this behavior?
- What can I do to skip the format specifier for some of the fields (e.g. the string fields) in formatSpec in approach 1?
- How do I specify formatSpec if I want to pass string arrays instead of single strings (I get an error), so that all the string data is written in one go?
- What would be the most efficient way to write data in my use case (integer + string type fields)?
Thanks.
2 comments
Walter Roberson
on 7 Feb 2018
"I am aware that I am not specifying the format in the middle calls. But the file written is the same in both cases."
... but would not be if the strings contained any % or \ characters.
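This can be checked without writing a file, because sprintf interprets its format argument the same way fprintf does. A minimal sketch (the variable names are only for illustration):

```matlab
% A character vector that contains an escape sequence.
s = 'a\nb';                   % four characters: a, backslash, n, b

asFormat = sprintf(s);        % s is parsed as a format: \n becomes one line feed
asData   = sprintf('%s', s);  % s is passed as data and copied unchanged

disp(numel(asFormat))         % 3
disp(numel(asData))           % 4

% The same applies to percent signs.
p = '100%%';
disp(sprintf(p))              % 100%
disp(sprintf('%s', p))        % 100%%
```

So passing a string as the format only gives the same file as '%s' when the string contains no % or \ characters.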
Answers (2)
Jos (10584)
on 7 Feb 2018
You can skip the format identifier for string inputs, simply because fprintf uses the format identifier to transform its inputs into strings. No need to do that when the inputs are strings already ...
So, use fprintf(fid,'%s',str#) for a fair comparison regarding timings. You'll probably see that the second code will run slower :)
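A fair comparison along these lines could be set up as in the sketch below. It is only a sketch: it uses character vectors in a cell array rather than string objects, writes to temporary files (the names fileA and fileB are hypothetical), and the timings it prints will vary with the machine, the disk and the OS cache.

```matlab
strs = {'str1', 'str2str2', 'str3str3str3', 'str4str4str4str4', 'str5str5str5str5str5'};
n = 10000;
fileA = [tempname, '.txt'];
fileB = [tempname, '.txt'];

% Variant A: one fprintf call per line.
fid = fopen(fileA, 'Wt');
tic
for index = 1:n
    fprintf(fid, '%d%s%s%s%s%s%04X\n', index, strs{:}, index);
end
tSingle = toc;
fclose(fid);

% Variant B: several fprintf calls per line, each with an explicit format.
fid = fopen(fileB, 'Wt');
tic
for index = 1:n
    fprintf(fid, '%d', index);
    for k = 1:numel(strs)
        fprintf(fid, '%s', strs{k});
    end
    fprintf(fid, '%04X\n', index);
end
tMulti = toc;
fclose(fid);

% Both variants should produce identical files.
sameOutput = isequal(fileread(fileA), fileread(fileB));
fprintf('single call: %.3f s, multiple calls: %.3f s, same output: %d\n', ...
    tSingle, tMulti, sameOutput);
delete(fileA);
delete(fileB);
```

Running each variant several times and comparing the spread is more informative than a single measurement.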
4 comments
Walter Roberson
on 7 Feb 2018
I wonder how the timing changes if you use character vectors instead of string objects?
Jan
on 7 Feb 2018
Edited: Jan
on 7 Feb 2018
Replace
fprintf(fid, str1);
by
fwrite(fid, str1, 'char')
so that fprintf does not try to parse the string as a format.
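You can omit the format specifier for static strings by inserting them into the format string once, outside the loop:
str1 = 'str1';
str2 = 'str2str2';
str3 = 'str3str3str3';
str4 = 'str4str4str4str4';
str5 = 'str5str5str5str5str5';
% CR+LF, written as-is in binary mode (named lineBreak to avoid shadowing the built-in newline)
lineBreak = char([13, 10]);
fid = fopen('post.txt', 'W');
% '%%d' and '%%04X' survive sprintf as '%d' and '%04X'; the '%s' fields are filled in here
formatSpec2 = '%%d%s%s%s%s%s%%04X';
formatSpec = [sprintf(formatSpec2, str1, str2, str3, str4, str5), lineBreak];
idx = 1:10000;
% Each column of [idx; idx] supplies the index once for %d and once for %04X
fprintf(fid, formatSpec, [idx; idx]);
fclose(fid);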
The idea is: Create the fixed parts once only. With your example code, you can even omit the loop.
Text mode requires the line breaks to be detected on output. In binary mode with hard-coded line breaks, the code should be faster and does not depend on the platform.
Remember that the timing of file access also depends on the operating system and the hard disk: there are caches in the OS and on the disk. Creating the same file twice with the same method can take very different amounts of time.
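For the string-array part of the question, one option is to join the array into a single string and build all lines in memory before writing them with one call. This is a sketch of an alternative rather than the method described above: the variable names and the temporary output file are assumptions, and whether it is actually faster would have to be measured.

```matlab
strs = ["str1", "str2str2", "str3str3str3", "str4str4str4str4", "str5str5str5str5str5"];
idx = (1:10000).';                          % column vector of indices

fixedPart = join(strs, "");                 % all fixed strings as one string scalar
hexPart   = string(dec2hex(idx, 4));        % same digits as '%04X' for these values
lines     = string(idx) + fixedPart + hexPart;

% Join the lines with line feeds and terminate the last line as well.
txt = [char(join(lines, newline)), newline];

outFile = [tempname, '.txt'];               % hypothetical output location
fid = fopen(outFile, 'W');                  % binary mode, no automatic flushing
fwrite(fid, txt, 'char');                   % written as data, nothing is parsed
fclose(fid);
```

Because the text is written with fwrite, any % or \ characters in the strings are copied unchanged.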