How do I allocate CPU resources to a batch job?

sebrz on 3 Jun 2022
Edited: sebrz on 8 Jun 2022
I want to run several batch jobs in the background. Each job runs a different script, but every script calls 'mpi -np 16 "someApplication"', which runs in parallel on 16 physical CPU cores. This core count is fixed.
Do I need to set up a pool of 16+1 workers for each batch job, or one worker with 16 CPUs? What would be the best way to run multiple jobs in the background on a server with several multiples of 16 CPUs? Can a worker access multiple CPUs if necessary?
Thanks in advance!

Answers (1)

Raymond Norris on 3 Jun 2022
It would help if you could provide an example of the script and how you're running the job.
Let's assume you're using PBS; maybe your job script looks something like this:
#!/bin/sh
#PBS -l ... (request nodes, ppn, etc.)
module load matlab
mpirun -np 16 matlab -r someApplication
* You wrote mpi -np, but I'm assuming you meant mpiexec/mpirun. mpirun should be smart enough that you don't even need to specify -np 16.
Allocating 16 cores to MATLAB means MATLAB will see 16 cores, but it doesn't start a "pool" of workers by default. Rather, MATLAB will use the cores for implicit multithreading (e.g., fft).
If you start a local pool, you'd want to keep it to a maximum of 16 workers (which is really 17 processes, including the MATLAB client). In this case, each worker starts MATLAB in single-threaded mode by default. A worker can access multiple CPUs if you tell the pool to start with more threads per worker. For example, 8 workers with 2 threads each:
local = parcluster("local");
local.NumThreads = 2;
pool = local.parpool(8);
Again, if you can provide a sample batch script and high-level MATLAB code, it'll be easier to guide you.
1 Comment
sebrz on 8 Jun 2022
Edited: sebrz on 8 Jun 2022
Hi Raymond,
Thanks for your answer and the clarification about pools and workers. And yes, I meant mpirun.
The MATLAB script currently looks like this:
% myScript.m
% set up
x0 = somevalue;
b.a = anothervalue;
...
[xmin, fmin] = optimizer('someFunction', x0, b);
and the function is defined in the same directory as myScript.m and calls an external application/module.
% someFunction.m, defined in the same directory as myScript.m
function f = someFunction(a, b)
doStuffInDirectory;
% system returns the exit status of the command as its first output
f = system('mpirun -np 16 externalApplication');
Let's say I want to do this with Slurm, and a node has 48 CPUs.
In the first scenario, I have different scripts that call different optimizers or have different objectives, constraints, etc.:
#!/bin/bash
...
#SBATCH --nodes=1
#SBATCH --tasks-per-node=3
#SBATCH --cpus-per-task=16
MCRMODULE=MATLAB
module rm matlab
module load $MCRMODULE
module load externalApplication
matlab -nodisplay -singleCompThread -r "myScript;"
Do I run the different batch jobs from one Slurm script, or do I write multiple Slurm scripts?
matlab -nodisplay -singleCompThread -r "myScript1;"
matlab -nodisplay -singleCompThread -r "myScript2;"
Or can I just write a MATLAB script like the following and start all the jobs from one Slurm script?
%myBatchScript
job1 = batch('myScript1')
job2 = batch('myScript2')
...
On my PC, I have seen that if I run a batch like this several times in the Command Window:
job = batch('myScript')
it works without problems and I do not have to set pool/workers.
But since I am moving to a bigger cluster, I was wondering what the best option would be.
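For the first scenario, one possible approach is a single Slurm script that starts each MATLAB session in the background and then waits for all of them. The following is only a sketch under several assumptions: the script names myScript1 to myScript3 are placeholders taken from the question, the MATLAB_CMD variable and the run_scripts helper are hypothetical names introduced here, and -batch (available since R2019a) is used instead of -r so that each session exits when its script finishes. Whether three concurrent "mpirun -np 16" calls are each given their own 16 cores depends on how the site's Slurm and MPI are configured, and the scripts must not write to the same files.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=3
#SBATCH --cpus-per-task=16

# Command used to start MATLAB; overridable so the script can be dry-run.
MATLAB_CMD="${MATLAB_CMD:-matlab}"

# Start one MATLAB session per script name in the background, then wait
# for all of them. Returns non-zero if any session failed.
run_scripts() {
    local script pid status=0
    local pids=()
    for script in "$@"; do
        "$MATLAB_CMD" -nodisplay -singleCompThread -batch "$script" &
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || status=1
    done
    return "$status"
}

# Only launch when running inside a Slurm allocation.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    # myScript1..3 are the hypothetical script names from the question.
    run_scripts myScript1 myScript2 myScript3
fi
```

Submitting separate Slurm scripts, one per MATLAB script, is the other common option; it leaves the placement of each job to the scheduler at the cost of more files to maintain.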
In the second scenario, I want to evaluate multiple optimizations in parallel, with different inputs but the same optimizer, constraints, etc.
Unfortunately, "someFunction" is not written in a way that lets me evaluate it in parallel while it calls the external application, because the application writes files into the working directory.
Ideally, my MATLAB code would look like this:
% myIdealScript
param = [NxM];   % placeholder for the N-by-M parameter matrix
for i = 1:size(param,2)
    f(i) = myScriptAsAFunction(param(:,i));
end
and would be evaluated with one Slurm script. Do I just request 16×N/48 nodes and leave the tasks per node at 3?
matlab -nodisplay -singleCompThread -r "myIdealScript;"
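For the second scenario, a Slurm job array is one way to avoid computing the node count by hand: each array task requests the resources for a single run, and the scheduler decides how many tasks share a node. The sketch below also gives every task its own working directory, since the external application writes into the directory it runs in. It rests on assumptions that are not in the question: the array range 1-4, the runs/task_<id> directory layout, the make_task_dir helper, and a myScriptAsAFunction that accepts the array index and selects its own inputs are all hypothetical. Requesting 16 tasks (rather than 16 CPUs for one task) is a guess about what the site's mpirun needs in order to see 16 slots.

```shell
#!/bin/bash
#SBATCH --array=1-4          # hypothetical: one array task per parameter set
#SBATCH --nodes=1
#SBATCH --ntasks=16          # assumption: mpirun -np 16 needs 16 task slots

# Create (if needed) and print a private working directory for one array task.
make_task_dir() {
    local base="$1" task_id="$2"
    local dir="${base}/task_${task_id}"
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Only run when Slurm has assigned an array index to this task.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    # Directory holding the MATLAB code; remembered before changing directory.
    src_dir="${SLURM_SUBMIT_DIR:-$PWD}"
    workdir="$(make_task_dir "${src_dir}/runs" "$SLURM_ARRAY_TASK_ID")"
    cd "$workdir" || exit 1
    # Assumption: myScriptAsAFunction accepts the array index and picks
    # the matching parameter set itself.
    matlab -nodisplay -singleCompThread \
        -batch "addpath('${src_dir}'); myScriptAsAFunction(${SLURM_ARRAY_TASK_ID})"
fi
```

With this layout the MATLAB loop over the parameter sets is replaced by the array index, so no parallel pool is needed for the sweep itself.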

Release

R2021b