- Which platform is MATLAB Parallel Server running on, Linux or Windows?
- Which scheduler are you using (MJS, PBS, etc.)?
- What size pool are you running?
- How many cores per node?
- How much RAM per node?
Unable to submit task result (MATLAB Parallel Server)
Hi,
I am running some tests on a cluster. I create a job and submit several tasks, but I get the following error:
Error: Cannot rerun task because there are no rerun attempts left (The task has no rerun attempts left.).
Original cancel message:
java.lang.Exception: Unable to submit task result - MATLAB will now exit and restart.
Where should I start looking? What does this error mean in practice? Is the problem on the client side or on the cluster side?
Answers (1)
Raymond Norris
on 2 Dec 2021
Hi Maria,
A few questions first:
If you're using a scheduler other than MJS, try the following. I'll show it with both batch and parpool.
setenv('MDCE_DEBUG','true')
cluster = parcluster;
% If you're using batch
job = cluster.batch();   % pass your usual batch arguments here
job.wait
cluster.getDebug(job)
% If you're using parpool
pctconfig('preservejobs',true);
pool = cluster.parpool();
cluster.getDebug(cluster.Jobs(end))
If you're using MJS:
mjs = parcluster;
mjs.ClusterLogLevel = 4;
% Call either batch or parpool
mjs.getClusterLogs()
Perhaps the log file will display something else. If I had to guess, I'm betting you're running out of memory.
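To get a rough idea of how much memory a worker node has free, something like the following might help. This is only a sketch and assumes the workers run Linux, where the `free` utility is available; it is not part of the debug-log procedure above, and the node that runs the task may not be the one that failed.

% Sketch only. Assumption: Linux workers with the `free` utility installed.
cluster = parcluster;
% Run `free -m` on one worker; system returns the exit status and the text output.
job = cluster.batch(@system, 2, {'free -m'});
job.wait
out = job.fetchOutputs;   % 1-by-2 cell array: {status, cmdout}
disp(out{2})              % memory report from the worker node, in MB

If the free memory shown is small compared with what each task needs, that would be consistent with the out-of-memory guess.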