Unable to start parpool in multiple MATLAB instances simultaneously on a single machine

I tried this in multiple MATLAB instances:
for i = 1:2
str2Eval = [ '!matlab -r "myFunction(''' fileName(i) ''');exit;" &'];
eval(str2Eval);
% This creates a separate MATLAB instance for each file so that they run in parallel.
% Inside "myFunction" I use "parfeval" to run some operations in parallel.
end
% Both MATLAB instances opened and worked correctly up to the parfeval call,
% then showed an error while creating the parallel pool.
% (This worked correctly when previously run in a single instance.)
% I closed ALL MATLAB instances and opened a new one.
% I tried running "parpool(2)"; it does not work and gives the following error:
Starting parallel pool (parpool) using the 'local' profile ...
Error using parpool (line 145)
Parallel pool failed to start with the following error. For more detailed information, validate
the profile 'local' in the Cluster Profile Manager.
Caused by:
Error using parallel.internal.pool.InteractiveClient>iThrowWithCause (line 670)
Failed to locate and destroy old interactive jobs.
Error using parallel.Cluster/findJob (line 74)
Unknown type: concurrentconcurrent.
I restarted Windows 10. There is no "local_scheduler_data" or "local_cluster_jobs" folder in "prefdir". I tried to validate the profile from the Cluster Profile Manager; all tests passed except the last one, "Parallel pool test (parpool)". Setting "distcomp.feature( 'LocalUseMpiexec', false )" did not work, and neither did running in administrator mode.
The college workstation has 32 cores and enough RAM to run my model in parallel. I am just trying to run some commands in parallel that are independent of each other.
  1. How can I make "parpool" work again? (Solved by deleting the "R2020a" folder inside the "local_cluster_jobs" folder in the parent directory of "prefdir".)
  2. Is it possible to use parpool in multiple MATLAB instances running simultaneously? If yes, how?
  3 comments
Julian H on 10 Sep 2020
In my experience on R2017b, it is possible to start many parallel pools in different MATLAB instances, but their creation times must not overlap. In other words, if one MATLAB instance is creating its pool while another tries to create its own separate pool, the second one (or in some cases both) will fail. Delaying each creation by the time it takes to set up a pool (~10-30 seconds depending on the pool size and hardware) seems to help. However, this is purely anecdotal; I can't explain why this is the case.
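The delay described above could be sketched as follows. This is only an illustration: `launch_staggered` and its arguments are hypothetical names, `fileNames` is assumed to be a cell array of character vectors, and `myFunction` is the function from the question. The delay counts from launch, so it only helps if each instance reaches its pool creation at a roughly constant time after startup.

```matlab
function cmds = launch_staggered(fileNames, delaySec, dryRun)
% LAUNCH_STAGGERED  Start one MATLAB instance per file, spaced delaySec apart.
%   Hypothetical helper, not part of MATLAB. fileNames is assumed to be a
%   cell array of character vectors. With dryRun = true the commands are
%   only returned and nothing is launched.
    if nargin < 3
        dryRun = false;
    end
    cmds = cell(1, numel(fileNames));
    for i = 1:numel(fileNames)
        % The trailing "&" makes system() return without waiting for the child.
        cmds{i} = sprintf('matlab -r "myFunction(''%s'');exit;" &', fileNames{i});
        if ~dryRun
            system(cmds{i});
            if i < numel(fileNames)
                % Give this instance time to finish creating its pool.
                pause(delaySec);
            end
        end
    end
end
```

For example, `launch_staggered({'a.mat', 'b.mat'}, 30)` would wait 30 seconds between the two launches; the file names here are placeholders.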
jessupj on 10 Sep 2020
I think when I did this before (R2014 or thereabouts), I had to define different clusters so that the MATLAB instances (called from the shell using GNU parallel) opened independent pools. I didn't try J.Herzog's delay tactic. That never occurred to me.
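One way to give each instance an independent cluster object is to point its job storage at a private folder before opening the pool. The sketch below assumes the collisions happen in the shared job storage folder, which the `findJob` error and the "local_cluster_jobs" fix in the question suggest but do not prove. `start_private_pool` is a hypothetical name, and `'local'` is the profile name from the error message; it may differ in other releases.

```matlab
function pool = start_private_pool(numWorkers)
% START_PRIVATE_POOL  Open a pool whose job data lives in a per-instance folder.
%   Hypothetical helper, not part of MATLAB.
    c = parcluster('local');            % profile name taken from the error message
    storageDir = tempname;              % unique path under tempdir
    mkdir(storageDir);                  % the folder must exist before it is assigned
    c.JobStorageLocation = storageDir;  % keep this instance's job files apart
    pool = parpool(c, numWorkers);
end
```

Calling `start_private_pool(2)` inside `myFunction` instead of relying on the automatically created pool would then keep each instance's job files separate. The temporary folders are not cleaned up in this sketch.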


Answers (2)

Moritz Schappler on 5 Nov 2020
I also encountered this problem when running multiple parallel instances on different nodes of a PBS computing cluster.
When running about 20 parallel instances (each with its own parpool) and starting them all at the same time, this happens nearly always.
You can prevent the simultaneous write access, which corrupts the prefdir, by using some kind of synchronization.
I tried a simple implementation using a lockfile. Perhaps this is helpful. The commands would look like this:
%% start pool with protection of the prefdir
parpool_writelock('lock', 180, true); % wait at most 3 minutes for other parpools starting simultaneously
parpool(Set.general.parcomp_maxworkers);
parpool_writelock('free', 0, true);
%% parallel computation
% ....
%% end pool with protection of the prefdir (not sure if this is necessary)
parpool_writelock('lock', 300, true); % wait at most 5 minutes for other parpools to start/stop
delete(gcp('nocreate'));
parpool_writelock('free', 0, true);
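The implementation of `parpool_writelock` is not shown in this answer. As a rough illustration of the lockfile idea only, the sketch below uses Java's `java.io.File.createNewFile`, which creates a file only if it does not already exist. `acquire_lockfile` is a hypothetical helper and not the author's function; it assumes MATLAB was started with the JVM and that the lock path is on a file system where file creation is reliably exclusive, which may not hold on some network shares.

```matlab
function acquired = acquire_lockfile(lockPath, timeoutSec)
% ACQUIRE_LOCKFILE  Wait up to timeoutSec seconds to create lockPath exclusively.
%   Hypothetical helper, not the parpool_writelock function used above.
%   Requires the JVM (do not start MATLAB with -nojvm).
%
%   Example:
%       lockPath = fullfile(tempdir, 'parpool.lock');
%       if acquire_lockfile(lockPath, 180)
%           cleanup = onCleanup(@() delete(lockPath));  % release the lock on exit
%           parpool(2);
%           clear cleanup
%       end
    lockFile = java.io.File(lockPath);
    startTime = tic;
    acquired = false;
    while ~acquired
        % createNewFile succeeds only if the file did not exist before.
        acquired = lockFile.createNewFile();
        if ~acquired
            if toc(startTime) >= timeoutSec
                return;              % give up; the caller decides what to do
            end
            pause(0.5 + rand);       % randomized wait so instances do not retry in lockstep
        end
    end
end
```

If an instance crashes while holding the lock, the file stays behind and the other instances wait until their timeout expires, so a real implementation would probably need some handling for stale locks.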

Moritz Schappler on 14 Oct 2021
Another solution to the problem may be to change the home directory environment variable before starting the parallel instances of MATLAB. Depending on the system (Windows/Linux) and configuration (local machine/several cluster nodes), the commands may be different. This is bash code for running MATLAB on a computing cluster:
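# Use the per-job scratch directory as HOME; fall back to a fresh temporary directory if TMPDIR is unset.
export HOME="${TMPDIR:-$(mktemp -d)}"
matlab -nodesktop < script.m > "$LOGFILE" 2>&1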
Every cluster job gets its own temporary directory ("$TMPDIR") of the form "/scratch/7782473.batch.css.lan". If the variable TMPDIR is not defined, a unique temporary directory should be generated instead. In my case this directory is deleted at the end of the session and only contains a java.log file. When MATLAB is started with the second command, the profile directory is no longer /home/username/.matlab/R2021a, which was previously accessed in parallel and caused the file access problems in my case.
I also included this in my scripting environment for submitting parallel computing jobs to a PBS cluster.
