load specified column in matfile too slow

Yu Li on 3 Oct 2018
Commented: Walter Roberson on 3 Oct 2018
I have a MAT-file containing a matrix of size 5e6-by-50.
I want to load a specific column into memory, but I found that reading a single column takes nearly as long as reading the whole variable.
Below is the test code:
A=rand(5e6,50);
save A A
f=matfile('A');
tic
tmp=f.A;
toc
tic
tmp=f.A(:,1);
toc
Is there any way to improve the performance?
Thanks!
Yu
  2 comments
Walter Roberson on 3 Oct 2018
I notice you did not specifically save with -v7.3, so you might be getting a -v7 file.
When I test on my system with -v7.3, selecting one column comes out roughly 10% faster. Not as good as one might hope, though.
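One way to check which format an existing file uses is to read the text header at the start of the file. This is a minimal sketch, assuming the file is named A.mat as in the question, and assuming that a -v7.3 file's header begins with 'MATLAB 7.3 MAT-file' while a -v7 file's header begins with 'MATLAB 5.0 MAT-file'. The small matrix is only there to keep the check quick.

```matlab
% Save a small test matrix explicitly in -v7.3 format (size is illustrative).
A = rand(1000, 50);
save('A.mat', 'A', '-v7.3');

% Read the first 19 bytes of the file header as text.
fid = fopen('A.mat', 'r');
header = fread(fid, 19, 'uint8=>char')';
fclose(fid);

% Assumption: v7.3 headers start with this exact string.
isV73 = strncmp(header, 'MATLAB 7.3 MAT-file', 19);
fprintf('Header: %s (v7.3: %d)\n', header, isV73);
```

If the header reports 5.0 instead, the file was saved in an older format and re-saving with '-v7.3' is worth trying before timing partial reads.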
Yu Li on 3 Oct 2018
Yes. I would expect reading one column to take about 2% of the time needed to read the whole variable, since there are 50 columns in total.
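The 2% figure follows from the array dimensions alone. A quick back-of-the-envelope sketch, assuming 8-byte doubles and ignoring any file-format overhead or compression:

```matlab
nRows = 5e6;
nCols = 50;
bytesPerDouble = 8;

totalBytes  = nRows * nCols * bytesPerDouble;  % whole matrix
columnBytes = nRows * bytesPerDouble;          % a single column

fprintf('Whole matrix: %.1f GB\n', totalBytes / 1e9);
fprintf('One column:   %.1f MB\n', columnBytes / 1e6);
fprintf('Fraction:     %.0f%%\n', 100 * columnBytes / totalBytes);
```

So the full variable is about 2 GB in memory, while one column is only about 40 MB.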


Answers (1)

Walter Roberson on 3 Oct 2018
In this case, you can do much better by saving with -nocompression. Timings on my system:
Save -v7.3 -nocompression
Elapsed time is 17.814604 seconds.
Done save -v7.3 -nocompression
Start matfile -v7.3 -nocompression
Elapsed time is 0.016278 seconds.
Done matfile -v7.3 -nocompression
Start recall entire variable -v7.3 -nocompression
Elapsed time is 2.195975 seconds.
Done recall entire variable -v7.3 -nocompression
Start recall one column -v7.3 -nocompression
Elapsed time is 1.089280 seconds.
Done recall one column -v7.3 -nocompression
Save -v7.3
Elapsed time is 58.543461 seconds.
Done save -v7.3
Start matfile -v7.3
Elapsed time is 0.077814 seconds.
Done matfile -v7.3
Start recall entire variable -v7.3
Elapsed time is 10.139135 seconds.
Done recall entire variable -v7.3
Start recall one column -v7.3
Elapsed time is 9.118167 seconds.
Done recall one column -v7.3
Source code:
A=rand(5e6,50);
time_it(A, {'-v7.3' '-nocompression'})
time_it(A, {'-v7.3'})
function time_it(A, saveoptions)
    % Time save, matfile creation, a full read, and a one-column read
    % for the given cell array of save options.
    savedesc = strjoin(saveoptions, ' ');

    fprintf('Save %s\n', savedesc);
    tic
    save('A', 'A', saveoptions{:});
    toc
    fprintf('Done save %s\n', savedesc);

    fprintf('Start matfile %s\n', savedesc);
    tic
    f = matfile('A');
    toc
    fprintf('Done matfile %s\n', savedesc);

    % Read the whole variable through the matfile object.
    fprintf('Start recall entire variable %s\n', savedesc);
    tic
    tmp = f.A;
    toc
    fprintf('Done recall entire variable %s\n', savedesc);

    % Read only the first column.
    fprintf('Start recall one column %s\n', savedesc);
    tic
    tmp = f.A(:,1);
    toc
    fprintf('Done recall one column %s\n', savedesc);
end
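To put the timings above in terms of the expected 2%, the ratio of the one-column read to the full read can be computed directly from the reported elapsed times. The variable names here are only for illustration.

```matlab
% Elapsed times copied from the output above, in seconds.
fullNoComp = 2.195975;   colNoComp = 1.089280;   % -v7.3 -nocompression
fullComp   = 10.139135;  colComp   = 9.118167;   % -v7.3 (default compression)

ratioNoComp = colNoComp / fullNoComp;
ratioComp   = colComp / fullComp;

fprintf('-nocompression: column read takes %.0f%% of the full read\n', 100 * ratioNoComp);
fprintf('compressed:     column read takes %.0f%% of the full read\n', 100 * ratioComp);
```

On these numbers the column read takes about half the time of the full read without compression and about 90% with it, so both remain far from the 2% that the data volume alone would suggest.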
  7 comments
Yu Li on 3 Oct 2018
Hi,
Thanks for your test; it gave me a much deeper understanding of this.
I think there must be a bottleneck somewhere. I will contact MathWorks for further investigation.
Thanks!
Yu
Walter Roberson on 3 Oct 2018
You can refer them to this post and my test code.
One thing they are likely to point out is that the default of compression is intended for "real" data, not for rand(), and that when you read/write with compression, the performance would be expected to vary with how compressible the data is. Thus you should probably run this code with the rand() replaced by load of one of your actual matrices.
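As a rough illustration of how much compressibility matters, the sketch below saves two stand-in matrices with the default (compressed) -v7.3 settings and compares their sizes on disk. The file names, variable names, and the repetitive matrix are assumptions made up for this example; real data would likely fall somewhere between the two extremes.

```matlab
% Stand-ins for illustration only: random data vs. highly repetitive data.
n = 1e5;
Arand = rand(n, 50);            % essentially incompressible
Arep  = repmat(1:50, n, 1);     % every row identical, so very compressible

save('A_rand.mat', 'Arand', '-v7.3');
save('A_rep.mat',  'Arep',  '-v7.3');

dRand = dir('A_rand.mat');
dRep  = dir('A_rep.mat');
fprintf('rand() data:     %.1f MB on disk\n', dRand.bytes / 1e6);
fprintf('repetitive data: %.1f MB on disk\n', dRep.bytes / 1e6);
```

For a meaningful benchmark, replace the stand-ins with a load of one of your own matrices and pass it to time_it in place of the rand() matrix.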
