Faster way of appending data to large struct array

Peter Manovski on 27 Mar 2023
Edited: Peter Manovski on 27 Mar 2023
I have 700k files of velocities (vx); each file holds a 16 × 77 array (x by y). Below is my code to build a large struct array so that I can access the data and then perform statistics at each x and y location. In the end I want a time series of the data at each x and y location, that is, the vx (velocity) at that location for all 700k time instances.
I have read many recommendations on here that struct and cell arrays are not efficient or fast for storing and accessing data, and that plain numeric arrays or vectors should be used instead. But I don't know how to implement that for my case. The bottleneck is the growing size of the array: for 7k samples it takes 151 seconds, but for the real 700k data set it takes days to process!
Can someone please recommend a speed improvement to the following:
ref_x = 16;
ref_y = 77;
NoFiles = 7000; % but for real case I have 700000
stats = [];
% preallocate struct
stats = struct('vx', zeros(NoFiles, ref_x, ref_y));
tic
for i = 1:NoFiles
    % code that loads file
    % each new file has new.vx data
    new.vx = rand(16,77);
    for k = 1:ref_x
        if i == 1
            stats(k).vx = new.vx(k,:);
        else
            stats(k).vx = cat(1, stats(k).vx, new.vx(k,:));
        end
    end
end
toc
Elapsed time is 37.361091 seconds.
% more code here to reshape arrays in desired format.

Accepted Answer

Matt J on 27 Mar 2023
Edited: Matt J on 27 Mar 2023
That is quite a bit of data. Even in single precision, the array will consume about 3.4 GB (16 × 77 × 700,000 values at 4 bytes each). This should be faster, though.
stats = nan(ref_x, ref_y, NoFiles, 'single');
tic
for i = 1:NoFiles
    % code that loads file
    % each new file has new.vx data
    stats(:,:,i) = new.vx;
end
toc
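Since the stated goal is a time series at every x and y location, here is a minimal sketch of how that could be pulled out of the 3-D array afterwards. The small NoFiles, the rand placeholder for the file loading, and the names kx, ky, ts, tsAll and col are assumptions for illustration only.

```matlab
% Small stand-in sizes; the real data set would use NoFiles = 700000.
ref_x = 16; ref_y = 77; NoFiles = 50;
stats = nan(ref_x, ref_y, NoFiles, 'single');
for i = 1:NoFiles
    new.vx = rand(ref_x, ref_y);   % placeholder for the code that loads a file
    stats(:,:,i) = new.vx;
end

% Time series at one location (kx, ky): a NoFiles-by-1 column vector.
kx = 3; ky = 10;
ts = squeeze(stats(kx, ky, :));

% All time series at once: one column per (x, y) location.
tsAll = reshape(stats, ref_x*ref_y, NoFiles).';   % NoFiles-by-(ref_x*ref_y)
col = sub2ind([ref_x, ref_y], kx, ky);            % column that holds (kx, ky)
```

With this layout tsAll(:, col) is the same series as ts, so either form can be used depending on whether one location or all of them are needed at a time.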
3 Comments
Matt J on 27 Mar 2023
Seems like the difference between preallocating with nan and with inf is non-impactful in this context.
N = 3000;
E = rand(100);
stats = nan([size(E), N], 'single');
tic
for i = 1:N
    % code that loads file
    % each new file has new.vx data
    stats(:,:,i) = E;
end
toc
Elapsed time is 0.045564 seconds.
stats = inf([size(E), N], 'single');
tic
for i = 1:N
    % code that loads file
    % each new file has new.vx data
    stats(:,:,i) = E;
end
toc
Elapsed time is 0.047948 seconds.
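For comparison with the growing-array approach in the question, the sketch below times both patterns on the same data. It is only an illustration under assumed sizes (N = 2000 slices of 16 × 77); the names grown, pre, tGrow and tPre are made up here, and the timings will depend on the machine. The growing version has to copy everything stored so far on each pass, so its cost rises roughly with the square of the number of files, whereas the preallocated version writes each slice once.

```matlab
N = 2000;
E = rand(16, 77, 'single');

% Growing: every cat allocates a larger array and copies all earlier data.
tic
grown = [];
for i = 1:N
    grown = cat(3, grown, E);
end
tGrow = toc;

% Preallocated: memory is reserved once and each pass writes one slice.
tic
pre = nan([size(E), N], 'single');
for i = 1:N
    pre(:,:,i) = E;
end
tPre = toc;

fprintf('growing: %.3f s, preallocated: %.3f s\n', tGrow, tPre);
```

Both loops end with identical arrays; only the way the memory is obtained differs.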
Peter Manovski on 27 Mar 2023
Edited: Peter Manovski on 27 Mar 2023
Thank you, that works way better! And yes, a high-speed camera can generate a lot of data quickly!
In my case I prefer NaN, because when I compute statistics, zeros or other fill values can bias the data.
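A short sketch of why the NaN fill is convenient for statistics: slices that were never written stay NaN and can be skipped with the 'omitnan' flag. The sizes, the two "missing" files, and the names vxMean, vxStd and nValid are assumptions for illustration.

```matlab
ref_x = 16; ref_y = 77; NoFiles = 20;
stats = nan(ref_x, ref_y, NoFiles, 'single');
for i = 1:NoFiles-2                      % pretend the last two files never loaded
    stats(:,:,i) = rand(ref_x, ref_y);   % placeholder for new.vx
end

vxMean = mean(stats, 3, 'omitnan');      % ref_x-by-ref_y, NaN slices skipped
vxStd  = std(stats, 0, 3, 'omitnan');    % same size, N-1 normalization
nValid = sum(~isnan(stats), 3);          % samples that contributed at each point
```

Had the array been filled with zeros instead, the two unwritten slices would have pulled every mean toward zero without any warning.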

Release

R2021b
