Parfor overhead: local cores vs. cluster core

Brandon on 19 May 2021

https://es.mathworks.com/matlabcentral/answers/834533-parfor-overhead-local-cores-vs-cluster-core

Commented: Edric Ellis on 20 May 2021

I have a parfor loop that takes its input data from a very large cell array, and every element of the cell array is eventually used over the course of the loop. The loop takes about 150 seconds on 20 local cores, but about 500 seconds on 20 cluster cores (the cluster has 100 cores, which I would like to use for scaling).

Two questions:

1) Is it safe to assume that this time difference is due to network communication latency?

2) If the answer to (1) is yes, is there a more efficient way to send the data in the cell array? As a highly simplified example of what I currently have:

for model_it = 1:100
    % some operations to create cell1, which has length k
    parfor ih = 1:k
        temp = cell1{ih};
        out = f(temp); % some operations done to temp
        output_store{ih} = out;
    end
    % some operations that use output_store to create the inputs for cell1 on the next model_it
end

I do not believe parallel.pool.Constant is an option here because the data in cell1 changes on every model iteration. Do I have other options for setting up this problem?
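One way to check question (1) might be to measure how much data the parfor loop actually transfers, using ticBytes and tocBytes from Parallel Computing Toolbox. The following is only a minimal sketch: the sizes, the random contents of cell1, and the sum that stands in for f are hypothetical placeholders, not the real model.

```matlab
% Hypothetical stand-ins for the real data (assumptions, not the actual model).
k = 8;
cell1 = cell(1, k);
for ih = 1:k
    cell1{ih} = rand(200);        % placeholder for each large input element
end

pool = gcp;                       % start or reuse the current parallel pool
output_store = cell(1, k);

ticBytes(pool);                   % start counting bytes sent to and from the workers
t = tic;
parfor ih = 1:k
    temp = cell1{ih};             % cell1 is sliced, so each worker receives only its elements
    out = sum(temp(:));           % placeholder for f(temp)
    output_store{ih} = out;
end
elapsed = toc(t);
bytes = tocBytes(pool);           % one row per worker: [sent to worker, received from worker]

fprintf('Loop time: %.2f s, data sent to workers: %.2f MB\n', ...
    elapsed, sum(bytes(:, 1)) / 1e6);
```

Running the same measurement on the local pool and on the cluster pool would show whether the transferred volume is the same in both cases, in which case the extra time on the cluster is more plausibly spent moving that data over the network than computing.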