Why does parallel.pool.Constant create a copy of the variable in memory for each worker sequentially instead of in parallel?

When creating a parallel.pool.Constant on 9 workers before a parfor loop, I noticed that memory usage ramps up in 9 successive steps instead of all at once. The attached image shows these steps, captured with Resource Monitor on Windows 7 before entering the parfor loop. This seems to mean that the copies for each worker are created sequentially rather than in parallel, which takes a lot of time. Why are these copies not created in parallel for faster execution? I am running R2017a.

Accepted Answer

Edric Ellis on 30 Jun 2017
I suspect you're creating the parallel.pool.Constant using data created on the client. It's much more efficient to have the workers create the data, if possible. Consider two cases:
% Case 1: data created on the client
parallel.pool.Constant(ones(1e4));
% Case 2: use the Constant constructor with a function handle to create
% the contents directly on the worker
parallel.pool.Constant(@() ones(1e4));
This results in the following memory usage pattern. In the screenshot, case 1 is indicated with a red arrow and case 2 with a green arrow.
As you can see, case 2 happens in parallel, and avoids the data transfer from the client to the workers (it's the data transfer that really causes the lack of parallelism).
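If a resource monitor is not at hand, timing the two constructors gives a rough view of the same difference. This is only a sketch under stated assumptions: the matrix size `n` is reduced from the 1e4 used above so it runs quickly, the names `tClient` and `tWorker` are illustrative, and the `spmd` blocks are there only to touch the value on every worker before the timer stops. The first `spmd` call also carries some start-up overhead, so treat the numbers as indicative and vary the order or repeat the runs.

```matlab
n = 2000;        % assumption: smaller than the 1e4 above so the sketch runs quickly
pool = gcp();    % reuse the current pool, or start one with the default profile

% Case 1: matrix built on the client, then sent to every worker
tic;
c1 = parallel.pool.Constant(ones(n));
spmd
    ok1 = isequal(size(c1.Value), [n n]);  % touch the value on each worker
end
tClient = toc;

% Case 2: each worker builds its own matrix from the function handle
tic;
c2 = parallel.pool.Constant(@() ones(n));
spmd
    ok2 = isequal(size(c2.Value), [n n]);  % touch the value on each worker
end
tWorker = toc;

fprintf('%d workers: client-built %.2f s, worker-built %.2f s\n', ...
    pool.NumWorkers, tClient, tWorker);
```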
If you really cannot create the data on the workers, you can use the parallel.pool.Constant constructor that accepts a Composite, like this:
% Build an empty Composite
c = Composite();
% Transfer the data from client only to worker 1
c{1} = ones(1e4);
c(2:end) = {[]};
spmd
    % Use labBroadcast to copy data to all workers (labBroadcast
    % is more efficient than the client/worker communication)
    c = labBroadcast(1, c);
end
% Build the Constant from the Composite
c = parallel.pool.Constant(c);
% Flush memory on the workers by executing an empty SPMD block
spmd, end
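However the Constant is built, the workers read it through its `Value` property inside the parfor loop. A minimal sketch, assuming a smaller matrix than in the examples above so that it runs quickly; `n` and `colSums` are illustrative names:

```matlab
n = 1000;                                  % assumption: small size for a quick run
c = parallel.pool.Constant(@() ones(n));   % each worker builds its own copy

colSums = zeros(1, n);
parfor k = 1:n
    % c.Value is the copy already held by this worker, so the matrix
    % is not sent again for each iteration
    colSums(k) = sum(c.Value(:, k));
end
```

Only the Constant handle `c` is passed into the loop; the matrix itself stays on the workers for as long as `c` exists on the client.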
1 Comment
Joseph Hall on 30 Jun 2017
Thank you. Unfortunately, I am using real-world data, so it cannot be created on the workers. Are there other modes of data transfer that could be done in parallel, such as having the workers read files from disk? I don't quite understand why the data transfer cannot be done in parallel.
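The disk-based route raised in this comment could be sketched with the function-handle constructor from the accepted answer, so that each worker runs `load` itself. This is a sketch under assumptions, not a confirmed recommendation from the thread: the file name `realData.mat` and the variable name `data` are hypothetical, the random matrix merely stands in for the real-world data, and the file must sit at a path every worker can read (true for a local pool, not necessarily for a cluster). Whether the reads actually overlap depends on the storage behind that path.

```matlab
% Hypothetical file and variable names, for illustration only
dataFile = fullfile(tempdir, 'realData.mat');
data = rand(500);                  % stand-in for the real-world data
save(dataFile, 'data');

% Each worker evaluates the function handle, so the file is read by the
% workers instead of the data being sent from the client
c = parallel.pool.Constant(@() load(dataFile));

total = 0;
parfor k = 1:size(data, 2)
    % load returns a struct, so the matrix is the field c.Value.data
    total = total + sum(c.Value.data(:, k));
end
```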
