How to prevent parfor from slowing down towards later iterations?
Felix
on 6 May 2025
Commented: Felix
on 7 May 2025
Hi there,
I'm running lots of simulations on an HPC cluster in a parfor loop. The number of iterations is around 150000, and I will probably want to run more than that later.
My scripts and functions print a progress update to a text log after each iteration, and I notice that the main parfor loop slows down dramatically towards the end. The CPU time elapsed per iteration does not change apart from slight variations, which leads me to believe that fewer workers are being used towards the end.
Does MATLAB assign all of the iterations to the workers once at the start of the parfor loop and then leave workers that finish sooner idle? It seems like a few unfortunate workers are stuck finishing their allocated iterations while the rest idle.
Is there a way to change this?
Accepted Answer
Damian Pietrus
on 6 May 2025
Out of curiosity, how many workers do you have in your pool? Is the total number of iterations in your parfor a reasonable multiple of the number of workers?
When assigning parfor iterations to workers in the pool, your client MATLAB session first sends small chunks of iterations to get started, sends larger chunks in the middle of the processing time, then finally sends smaller chunks so that the workers finish reasonably close to one another. However, this automatic process may sometimes result in certain workers wrapping up early and waiting for others to finish. You can see an extremely simplified version in the code below. Since neither 17 nor 19 is an even multiple of the 4 workers in the pool, at least one worker has to run five of the 0.5-second iterations in either case, so the overall execution time will be about the same for both loops.
if isempty(gcp('nocreate'))
    parpool('Processes', 4);
end

% Start the first timer for parallel execution
tic
nIterations = 17;
parfor i = 1:nIterations
    % Simulate some work with a pause
    pause(0.50);
end
toc

% Start the second timer for parallel execution
tic
nIterations = 19;
parfor i = 1:nIterations
    % Simulate some work with a pause
    pause(0.50);
end
toc
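The timing claim for the two loops above can be checked with a little arithmetic. This is only an idealized sketch: it assumes each worker runs one iteration at a time, every iteration takes exactly 0.5 s, and scheduling overhead is zero. The variable names are illustrative and not taken from the original code.

```matlab
% Idealized lower bound on the wall-clock time of each loop.
% Assumptions: 4 workers, 0.5 s per iteration, no scheduling overhead.
nWorkers    = 4;
tIteration  = 0.50;        % seconds per iteration, matching pause(0.50)
nIterations = [17 19];     % the two loop lengths from the example

% The busiest worker must run at least this many iterations.
rounds    = ceil(nIterations ./ nWorkers);
idealTime = rounds .* tIteration;

fprintf('%d iterations: %d rounds, about %.1f s\n', ...
    [nIterations; rounds; idealTime]);
```

Both loop lengths give five rounds, so the idealized time is 2.5 s in each case. Measured times from tic/toc will be somewhat larger because of communication and scheduling overhead.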
One potential workaround would be to use parforOptions to manually control the range partitioning. To quote the doc:
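You can control how parfor divides iterations into subranges for the workers with parforOptions. Controlling the range partitioning can optimize performance of a parfor-loop. For best performance, try to split into subranges that are:
- Large enough that the computation time is large compared to the overhead of scheduling the subrange
- Small enough that there are enough subranges to keep all workers busy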
This would allow you to manually choose how many iterations are being sent to each worker, potentially leveling out execution time at the end of your loop. If you do give it a try, let us know how it goes!
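As a minimal sketch of what this could look like, the loop below uses parforOptions with the 'fixed' range partition method and an explicit subrange size. The subrange size of 1 and the placeholder `results` array are assumptions chosen for illustration, not values recommended in this thread; for a loop with around 150000 short iterations a larger subrange would probably be needed to keep scheduling overhead down.

```matlab
% Sketch: fixed-size subranges instead of the automatic partitioning.
pool = gcp('nocreate');
if isempty(pool)
    pool = parpool('Processes', 4);
end

nIterations  = 19;
subrangeSize = 1;   % hypothetical value; tune it for the real workload

% Ask parfor to hand out iterations in chunks of subrangeSize.
opts = parforOptions(pool, ...
    'RangePartitionMethod', 'fixed', ...
    'SubrangeSize', subrangeSize);

results = zeros(1, nIterations);   % placeholder output, one slot per iteration
tic
parfor (i = 1:nIterations, opts)
    pause(0.50);          % stand-in for one simulation
    results(i) = i^2;     % placeholder result
end
toc
```

With small fixed subranges, a worker that finishes early can pick up another chunk instead of waiting, at the cost of more scheduling round trips. RangePartitionMethod can also take a function handle that receives the number of iterations and the number of workers and returns a vector of subrange sizes, which allows, for example, smaller chunks towards the end of the loop.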
More Answers (1)
Matt J
on 6 May 2025
Edited: Matt J
on 6 May 2025
Is there a way to change this?
I suspect not. Firstly, if you have reduction variables, the workers that aren't iterating aren't really "idle". They need to store their accumulated portion of the reduction variables until they can be combined with the portion from other workers.
Also, it can be counter-productive for one worker to broadcast part of its tasks to an idle worker midstream. In particular, any temporary variables residing on the non-idle worker would have to be cloned and broadcast. This will have overhead, especially if the temporary variables are large. At the very least, a complex decision would need to be made as to when these extra broadcasting steps are worth it.