
Big data percentile calculation

David Santos on 8 Aug 2019
Commented: David Santos on 8 Aug 2019
Hi,
I have a large set (30,000) of MAT-files, each containing a 4x1 cell array of 1483x2824 double matrices (4 matrices per file, ~30-40 MB).
These are time series files representing simulations over 3 months.
I want to calculate percentiles over all of these time series files, but loading every file takes more memory than my computer has. Any clue on how to solve this? I'm working on a server with 20 cores/40 threads and 256 GB of memory.
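A quick back-of-the-envelope check of the sizes quoted above shows why loading everything at once cannot work. This is only a sketch: it assumes all four matrices of every file are needed and that they are stored as uncompressed doubles once loaded.

```matlab
% Rough in-memory footprint, assuming every matrix of every file is loaded.
nFiles         = 30000;              % number of MAT-files
nPerFile       = 4;                  % matrices per file (4x1 cell array)
bytesPerMatrix = 1483 * 2824 * 8;    % one double takes 8 bytes
totalBytes     = nFiles * nPerFile * bytesPerMatrix;

fprintf('One matrix: %.1f MB\n', bytesPerMatrix / 1e6);   % One matrix: 33.5 MB
fprintf('All data:   %.2f TB\n', totalBytes / 1e12);      % All data:   4.02 TB
```

Roughly 4 TB against 256 GB of RAM means the percentile has to be computed either out of memory or with a streaming estimator.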
I heard about the P-square (P²) algorithm, but I couldn't find anything similar in MATLAB.
All the best
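For readers unfamiliar with the P-square algorithm mentioned in the question, the following is a minimal sketch of its update rule for a single scalar stream, written from the published description (Jain and Chlamtac, 1985) and not taken from any MATLAB toolbox. The variable names and the random test stream are illustrative assumptions. It keeps only five markers instead of the full data, so each new sample can be discarded after the update.

```matlab
% Sketch of the P-square update for ONE scalar stream (names are illustrative).
rng(0);                              % reproducible demo data
x = rand(1, 20000);                  % stand-in stream; true 0.9 quantile is 0.9
p = 0.9;                             % quantile to track (90th percentile)

q  = sort(x(1:5));                   % marker heights, seeded with first 5 samples
n  = 1:5;                            % actual marker positions
np = [1, 1+2*p, 1+4*p, 3+2*p, 5];    % desired marker positions
dn = [0, p/2, p, (1+p)/2, 1];        % increment of desired positions per sample

for j = 6:numel(x)
    xj = x(j);
    % Find the cell k containing xj, extending the extreme markers if needed.
    if xj < q(1)
        q(1) = xj; k = 1;
    elseif xj >= q(5)
        q(5) = xj; k = 4;
    else
        k = find(xj >= q(1:4), 1, 'last');
    end
    n(k+1:5) = n(k+1:5) + 1;         % markers above the cell shift right
    np = np + dn;

    % Move the three interior markers when they drift off their target.
    for i = 2:4
        d = np(i) - n(i);
        if (d >= 1 && n(i+1) - n(i) > 1) || (d <= -1 && n(i-1) - n(i) < -1)
            d = sign(d);
            % Piecewise-parabolic prediction of the new marker height.
            qn = q(i) + d / (n(i+1) - n(i-1)) * ( ...
                 (n(i) - n(i-1) + d) * (q(i+1) - q(i)) / (n(i+1) - n(i)) + ...
                 (n(i+1) - n(i) - d) * (q(i) - q(i-1)) / (n(i) - n(i-1)));
            if qn <= q(i-1) || qn >= q(i+1)
                % Fall back to linear interpolation toward the neighbour.
                qn = q(i) + d * (q(i+d) - q(i)) / (n(i+d) - n(i));
            end
            q(i) = qn;
            n(i) = n(i) + d;
        end
    end
end
est = q(3);                          % running estimate of the p-quantile
```

Applying this per grid cell would mean vectorising the five-marker state over all 1483x2824 cells (five height arrays per matrix), which is not shown here.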

Answers (1)

Steven Lord on 8 Aug 2019
See some of the tools and techniques available in MATLAB for working with Big Data, data that's too big to fit in memory. Many functions are supported on tall arrays.
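As a small illustration of the tall-array workflow referred to here, the sketch below computes a percentile on a tall column vector. The in-memory vector is only a stand-in for data backed by a datastore, and prctile requires Statistics and Machine Learning Toolbox.

```matlab
% Minimal tall-array example; x stands in for data too large for memory.
x = (1:1000)';                  % small in-memory column vector
t = tall(x);                    % convert to a tall column vector
p90 = gather(prctile(t, 90));   % evaluation is deferred until gather
```

Operations on tall arrays are deferred, so nothing is read or computed until gather is called.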
  2 comments
David Santos on 8 Aug 2019
Edited: David Santos on 8 Aug 2019
Thanks!
What would you recommend if I want to convert the 4x1 cell array in my files into just one array?
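If the goal is to merge the four matrices of one file into a single numeric array, one common in-memory option is to concatenate the cell contents along the third dimension. The sketch below uses small stand-in matrices in place of the 1483x2824 ones; prctile requires Statistics and Machine Learning Toolbox.

```matlab
% Stand-in for the 4x1 cell array stored in one file.
c = {magic(3); 2*magic(3); 3*magic(3); 4*magic(3)};

A   = cat(3, c{:});        % 3x3x4 numeric array, one page per cell
p90 = prctile(A, 90, 3);   % percentile across the pages, 3x3 result
```

This works on ordinary arrays that fit in memory; it does not by itself address the full 30,000-file data set.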
David Santos on 8 Aug 2019
OK, I'm now trying a fileDatastore with tall arrays.
With the definitions below I get the tall array t:
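% loadPrc.m: ReadFcn for the fileDatastore, returns one matrix per file
function data = loadPrc(filename)
    s = load(filename);
    % The variable is named 'l' plus the file name without its last 7 characters
    [~, name] = fileparts(filename);
    c = s.(['l' name(1:end-7)]);   % 4x1 cell array of 1483x2824 double
    data = c{1};                   % keeps only the first of the four matrices
end

% Script: build the datastore and the tall array
ds = fileDatastore('matBorrame', 'ReadFcn', @loadPrc, 'FileExtensions', '.mat')
t = tall(ds)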
t =
4×1 tall cell array
{1483×2824 double}
{1483×2824 double}
{1483×2824 double}
{1483×2824 double}
My problem is that the prctile calculation now fails with a data type error:
gather(prctile(t,90,3))
Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 1: 0% complete
Evaluation 0% complete
Error using tall/prctile (line 48)
Argument 1 to PRCTILE must be one of the following data types: numeric.
Learn more about errors encountered during GATHER.
That's because t should be a 1483x2824x4 numeric array, but I can't reshape or permute a tall array. Any clue on how to solve this?
All the best
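A possible way around the error, sketched under several assumptions: tall(ds) on a fileDatastore produces a tall cell array with one cell per file, so prctile sees cells, not numbers. If the ReadFcn returns each time step flattened to a row vector, cell2mat can turn the tall cell array into a tall numeric matrix whose tall dimension is time. The file names, the variable name A, and the tiny sizes below are hypothetical stand-ins. The 'Method','approximate' option is what recent prctile documentation describes for tall matrices along dimension 1, so check that it exists in your release.

```matlab
% Build a tiny stand-in data set: 6 files, each holding one 3x4 matrix A.
folder = tempname;
mkdir(folder);
nRows = 3; nCols = 4; nFiles = 6;
for k = 1:nFiles
    A = k * ones(nRows, nCols);              % time step k has constant value k
    save(fullfile(folder, sprintf('step%02d.mat', k)), 'A');
end

% Each read returns one time step flattened to a 1x(nRows*nCols) row vector.
readRow = @(f) reshape(getfield(load(f), 'A'), 1, []);
ds = fileDatastore(folder, 'ReadFcn', readRow, 'FileExtensions', '.mat');

t  = tall(ds);       % nFiles x 1 tall cell array, one row vector per cell
tm = cell2mat(t);    % nFiles x (nRows*nCols) tall double, time along dim 1

% Approximate percentile along the tall (time) dimension, then restore shape.
p90 = gather(prctile(tm, 90, 1, 'Method', 'approximate'));
p90 = reshape(p90, nRows, nCols);
```

With 1483x2824 columns per row, the approximate method may itself need a lot of memory, so processing the grid in column blocks might be necessary; that is not shown here.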
