Running parfor on SLURM limits cores to 1
Adam Shaw on 21 Aug 2022
Commented: Adam Shaw on 22 Aug 2022
Hello, I'm trying to run some parallelized code (using parfor) on a university high-performance computing cluster. To make sure parallelization is working correctly, I set up a single node with 32 cores via "srun --pty -t 00:30:00 -n 32 -N 1 /bin/bash -l", which I have verified does start an interactive session with 32 cores assigned. I can then start MATLAB from the command line as normal:
matlab -nodisplay -nosplash
But when I try to initialize the parallel pool I see
>> poolobj = parpool(32);
Starting parallel pool (parpool) using the 'local' profile ...
Error using parpool (line 151)
You requested a minimum of 32 workers, but the cluster "local" has the
NumWorkers property set to allow a maximum of 1 workers. To run a communicating
job on more workers than this (up to a maximum of 512 for the Local cluster), increase the value of
the NumWorkers property for the cluster. The default value of NumWorkers for a Local cluster is
the number of physical cores on the local machine.
Checking the number of cores shows only a single core has been assigned.
MATLAB detected: 32 physical cores.
MATLAB detected: 32 logical cores.
MATLAB was assigned: 1 logical cores by the OS.
MATLAB is using: 1 logical cores.
MATLAB is not using all logical cores because Operating System restricted the number of cores to: 1.
I reached out to the cluster administrator, who suggested the problem was on MATLAB's side, perhaps requiring a configuration change to allow more than 1 core to be used. I am not sure how to do that; for instance, I see no way to edit the parallel settings from the command line. I'm not familiar with running parallel MATLAB on non-local resources, so I would appreciate any insight into how to resolve this, or whether there is a better way to set up and submit such jobs.
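The restriction reported above can also be checked from the shell inside the interactive session, before MATLAB is started. This is a minimal sketch, assuming GNU coreutils `nproc` is available on the compute node: `nproc --all` reports the processors installed on the machine, while plain `nproc` reports those the current process is allowed to use, which is the number a scheduler-imposed CPU binding would reduce.

```shell
# Compare installed CPUs with the CPUs this session may actually use.
total=$(nproc --all)   # processors installed on the node
avail=$(nproc)         # processors available to the current process

echo "Installed: $total, available to this shell: $avail"

if [ "$avail" -lt "$total" ]; then
    echo "The OS is restricting this session to $avail of $total CPUs."
fi
```

If the second number is 1 on a 32-core node, the limit is being applied before MATLAB starts, so it would likely need to be addressed in the job request rather than in MATLAB's settings.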
Raymond Norris on 22 Aug 2022
With srun, -n sets the number of tasks, each with 1 CPU by default. It's possible that cgroups is telling MATLAB it only has 1 CPU. -c sets the number of CPUs per task. Setting -c to 32 (with a single task) might tell MATLAB it has been assigned 32 cores, allowing a local pool of 32 workers.
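A minimal sketch of that suggestion, assuming the cluster honours `-c` for interactive jobs (not confirmed in this thread). It requests one task with 32 CPUs instead of 32 single-CPU tasks, then derives a pool size from what SLURM granted. The helper name `pool_size` is hypothetical; `SLURM_CPUS_PER_TASK` is the environment variable SLURM sets when `-c` is given.

```shell
# On the login node: request 1 task with 32 CPUs on a single node.
# Left commented out because it opens an interactive session.
# srun --pty -t 00:30:00 -N 1 -n 1 -c 32 /bin/bash -l

# Inside the session: choose a worker count from what was granted.
# Falls back to nproc (CPUs available to this process) if the
# SLURM variable is not set.
pool_size() {
    if [ -n "${SLURM_CPUS_PER_TASK:-}" ]; then
        echo "$SLURM_CPUS_PER_TASK"
    else
        nproc
    fi
}

echo "Suggested parpool size: $(pool_size)"

# Then start MATLAB and open a pool of that size, for example:
# matlab -nodisplay -nosplash -r "parpool($(pool_size))"
```

Sizing the pool from the environment rather than hard-coding 32 keeps the request and the pool in step if the allocation changes.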