Building Distributed/Codistributed Array with .mat files

John Smith on 21 Nov 2017
Answered: Oli Tissot on 13 Aug 2020
I'm trying to run a truncated SVD on a very large (wide) dataset. The dataset will be ~1000x100000, broken up by row into multiple pieces (25x100000 each) and saved in different locations/workers as .mat files. If I have a list of these .mat files, is there a way to create a distributed or codistributed array from them?
I've already tried creating a fileDatastore and converting the result into a distributed array, but this just gets me a datastore containing multiple cells. Is there a function similar to cellUnderlying() for distributed arrays? I'm using MATLAB R2017a.

Answers (1)

Oli Tissot on 13 Aug 2020
The following should do what you want:
ds = datastore('A_rowchunk_*.mat', 'Type', 'file', 'ReadFcn', @importdata, 'UniformRead', true);
dA = distributed(ds);
[U, S, V] = svd(dA);
For this to work the files must be located in a (network) location accessible from all the workers, but this seems to already be the case for you.
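If the datastore route is not available, a codistributed array can also be assembled by hand inside spmd. The following is only a minimal sketch under several assumptions that are not in the original post: the files match the pattern A_rowchunk_*.mat, each file stores one 25x100000 matrix in a variable named A, every worker can read every file, and a parallel pool is already open. The names names, rowsPerFile, nCols, filePart and localRows are made up for illustration.

```matlab
% Hypothetical file list. dir() returns names in alphabetical order, so
% zero-pad the chunk numbers if the row order matters.
files = dir('A_rowchunk_*.mat');
names = fullfile({files.folder}, {files.name});
rowsPerFile = 25;      % rows stored in each file (from the question)
nCols = 100000;        % columns in every chunk (from the question)

spmd
    % Split the file list into contiguous blocks, one block per worker.
    filePart = codistributor1d.defaultPartition(numel(names));
    first = sum(filePart(1:labindex-1)) + 1;
    last = first + filePart(labindex) - 1;

    % Stack this worker's chunks into its local part of the array.
    localRows = zeros(0, nCols);
    for k = first:last
        s = load(names{k});              % assumes the variable is named A
        localRows = [localRows; s.A];    %#ok<AGROW>
    end

    % Describe the global layout: distributed along dimension 1, with
    % rowsPerFile*filePart(i) rows held by worker i.
    codist = codistributor1d(1, rowsPerFile * filePart, ...
        [rowsPerFile * numel(names), nCols]);
    dA = codistributed.build(localRows, codist);
end
```

Outside the spmd block, dA is seen by the client as a distributed array, so it can be passed to the same functions as the result of distributed(ds) above.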