Parfor HPC Cluster - How to Assign Objects to Same Core Consistently?
Hello,
TLDR: Is there a way to force MATLAB to consistently assign a classdef object to the same core when the parfor loop sits inside another loop?
Details:
I'm working on a fairly complex/large scale project which involves a large number of classdef objects & a 3D simulation. I'm running on an HPC cluster using the Slurm scheduler.
The 3D simulation has to run in a serial triple loop (at least for now; that's not the bottleneck).
The bottleneck is the array of objects, each of which stores its own state & calls ode15s once per iteration. These are all independent so I want to run this part in a parfor loop, and this step takes much longer than the triple loop right now.
I'm running on a small test chunk within the 3D space, with about 1200 independent objects. Ultimately this will need to scale about 100x to 150,000 objects, so I need to make this as efficient as possible.
It looks like MATLAB is smartly assigning the same object to the same core for the first ~704 objects, but after that it toggles seemingly at random between 2 cores and a few others:
The plot shows ~20 outer-loop iterations (running downward) with ~1200 class objects on the X axis; the colors represent the core/task assigned on each iteration. I built the underlying matrix with:
task = getCurrentTask();
coreID(ti, ci) = task.ID;
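For context, here is a minimal sketch of how such a matrix could be filled in. The sizes n_iters and n_objects are made-up small values, and the isempty check is an assumption added to cover code that runs outside a worker, where getCurrentTask returns empty.

```matlab
n_iters = 3;      % hypothetical number of outer iterations
n_objects = 8;    % hypothetical number of objects
coreID = zeros(n_iters, n_objects);
for ti = 1:n_iters
    row = zeros(1, n_objects);      % sliced output for this outer iteration
    parfor ci = 1:n_objects
        task = getCurrentTask();    % empty when not running on a worker
        if isempty(task)
            row(ci) = 0;            % 0 marks "ran on the client"
        else
            row(ci) = task.ID;      % ID of the task that ran iteration ci
        end
    end
    coreID(ti, :) = row;            % one row per outer iteration
end
```

Something like imagesc(coreID) would then show iterations as rows and objects as columns.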
This plot was created assigning the objects in a parfor loop, but that didn't help:

The basic structure of the code is this:
% pseudocode:
n_objects = 1200; % this needs to scale up to ~150,000 (so ~100x)
for i = 1:n_objects
    object_array(i) = constructor();
    % also tried doing this as parfor but didn't help
end
% ... other setup code...
% Big Loop:
dt = 1; % seconds
n_timesteps = 10000;
for i = 1:n_timesteps
    % unavoidable 3D triple loop update
    update3D(dt);
    parfor j = 1:n_objects
        % each object depends on 1 scalar from the 3D matrix
        object_array(j).update_ODEs(dt); % each object calls ode15s independently
    end
    % update 3D matrix with 1 scalar from each ODE object
end
I've tried adding more RAM per core, but for some reason the assignment still seems to break after roughly the 704th object, which is interesting.
And doing the object initialization/constructors inside a parfor loop made the initial core assignments less consistent (top row of plot).
Anyway, thank you for your help & please let me know if you have any ideas!
I'm also curious if there's a way to make the "Big Loop" the parfor loop, and make a "serial critical section" or something for the 3D part? Or some other hack like that?
Thank you!
ETA 7/28/25: Updated pseudocode with dt & scalar values passing between 3D simulation & ODE objects
Answers (1)
Edric Ellis
on 29 Jul 2025
I think this might be a case for spmd. With spmd, you can ensure you construct the objects on particular workers, and only ever operate on them there. The following code assumes you can divide the number of objects evenly (if you can't, you'll need to do a bit more bookkeeping).
spmd
    % Construct objects direct on the workers
    n_per_worker = n_objects / spmdSize;
    for i = 1:n_per_worker
        object_array(i) = constructor();
    end
    % Big loop
    dt = 1; % seconds
    n_timesteps = 10000;
    for i = 1:n_timesteps
        update3D(dt); % Not sure what this needs to modify...
        for j = 1:n_per_worker
            object_array(j).update_ODEs(dt);
            % Extract the scalar from each object
            scalar_per_obj(j) = object_array(j).get_scalar();
        end
        % Get all the scalars across all workers
        all_scalars = spmdCat(scalar_per_obj);
        % Do something with all_scalars...
    end
end
In this sketch, each worker constructs a vector of objects, and then operates on them independently. The spmdCat is an example showing how all workers can get all the scalar values, which I'm assuming they need to proceed to the next timestep. If you wish, you could have that piece run on only one worker by doing something more like this:
% call spmdCat, with result only on worker 1
dim = 1; % concatenation dimension
destination = 1;
all_scalars = spmdCat(scalar_per_obj, dim, destination);
if spmdIndex == destination
    result = sum(all_scalars.^2);
    % send result to all workers
    spmdBroadcast(destination, result);
else
    % Get result from "destination"
    result = spmdBroadcast(destination);
end
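If n_objects does not divide evenly by the number of workers, the extra bookkeeping could look something like the sketch below. This is plain index arithmetic and an illustration only; n_workers stands in for spmdSize, and the names counts, first and last are hypothetical.

```matlab
n_objects = 1200;
n_workers = 7;    % stand-in for spmdSize; 7 chosen so the split is uneven

base  = floor(n_objects / n_workers);   % minimum objects per worker
extra = mod(n_objects, n_workers);      % leftover objects to hand out

% The first "extra" workers each take one additional object
counts = base + ((1:n_workers) <= extra);

% Global index range owned by each worker
first = cumsum([1, counts(1:end-1)]);
last  = first + counts - 1;
```

Inside the spmd block, each worker would then use n_per_worker = counts(spmdIndex) and, if it needs to know which global objects it owns, the range first(spmdIndex):last(spmdIndex).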