Parpool Initialization Stuck in Timeout While Loop

Jeremy on 16 Sep 2022
Commented: Edric Ellis on 20 Sep 2022
I was able to run MATLAB parfor commands without issue one day ago. Today, MATLAB gets stuck on "Starting parallel pool (parpool) using the 'local' profile ...". Searching through the MATLAB functions used when starting a parpool, I found that it is getting stuck in an endless while loop in the nested function below, which is in JavaBackedSession.m. Any troubleshooting suggestions or fixes for this issue would be appreciated. I have already spent several hours reading about parfor issues online, and none of the four troubleshooting steps listed below worked.
Troubleshooting that did not work:
  1. Entering distcomp.feature( 'LocalUseMpiexec', true )
  2. Entering distcomp.feature( 'LocalUseMpiexec', false)
  3. Deleting the local_cluster_jobs folder and restarting Matlab
  4. Entering poolobj = gcp('nocreate'); delete(poolobj);
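The state that steps 3 and 4 act on can also be inspected from the command line before changing anything. The following is a minimal sketch, assuming the Parallel Computing Toolbox is installed and that the profile is named 'local' as in R2021b. It only reports what it finds; the two lines that change state are left commented out, and nothing here is confirmed to resolve the hang.

```matlab
% Report any pool that is already open, without starting a new one.
pool = gcp('nocreate');
if isempty(pool)
    disp('No parallel pool is currently open.');
else
    fprintf('A pool with %d workers is open.\n', pool.NumWorkers);
end

% Build a cluster object from the 'local' profile and show where it keeps
% its job data (the folder that step 3 deletes by hand).
c = parcluster('local');
fprintf('Job storage location: %s\n', c.JobStorageLocation);
fprintf('Maximum number of workers: %d\n', c.NumWorkers);
fprintf('Jobs left in storage: %d\n', numel(c.Jobs));

% Destructive: remove leftover job records through the cluster object
% instead of deleting the folder manually. Uncomment to run.
% delete(c.Jobs);

% Try a small pool on this cluster object. This can still hang if the
% worker processes fail to launch. Uncomment to run.
% parpool(c, 2);
```

If the number of leftover jobs is large or the storage location is on an unexpected drive, that is worth mentioning when reporting the problem.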
C:\Software\Mathworks\Matlab_All_Products_R2021b\toolbox\parallel\cluster\+parallel\+internal\+pool\JavaBackedSession.m
function session = waitForSessionCreation(~, sessionFuture, connectionCounter, ...
                                          checkFcn)
    % Block until the session has been created - which completes only when all the
    % connections are available.
    gotSession = false;
    session = [];
    previouslyConnectedTo = 0;
    while ~gotSession
        % This throws an appropriate error in the case where things go wrong.
        [gotSession, session] = parallel.internal.getJavaFutureResult(...
            sessionFuture, 1, java.util.concurrent.TimeUnit.SECONDS);
        if gotSession
            return
        end
        % If we get here, we have no session. Let's check to see how things are getting
        % on using the injected checkFcn - this might throw an error if things are bad.
        checkFcn();
        currentlyConnectedTo = double(connectionCounter.get());
        if currentlyConnectedTo > previouslyConnectedTo
            dctSchedulerMessage(2, 'Currently connected to: %d', currentlyConnectedTo);
            previouslyConnectedTo = currentlyConnectedTo;
        end
    end
end
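As written, the loop above polls the future once per second and exits only when the session arrives or checkFcn throws, so it has no overall deadline of its own. The sketch below illustrates that polling pattern with an added time limit. It is not MathWorks code: waitWithDeadline, pollFcn, maxSeconds, and pollInterval are hypothetical names, and pause stands in for the one-second wait on the Java future.

```matlab
% Usage: a poll that never succeeds, so the deadline is what ends the loop.
neverReady = @() deal(false, []);   % hypothetical stand-in for the future
noProblem  = @() [];                % hypothetical checkFcn that never throws
try
    waitWithDeadline(neverReady, noProblem, 2, 0.5);
catch err
    disp(err.identifier);           % sketch:timeout
end

function result = waitWithDeadline(pollFcn, checkFcn, maxSeconds, pollInterval)
    % Poll until pollFcn reports success, calling checkFcn between polls,
    % and raise an error once maxSeconds have elapsed.
    started = tic;
    while true
        [done, result] = pollFcn();
        if done
            return
        end
        checkFcn();                 % may throw if something has gone wrong
        if toc(started) > maxSeconds
            error('sketch:timeout', 'No result after %g seconds.', maxSeconds);
        end
        pause(pollInterval);        % stands in for the blocking 1-second wait
    end
end
```

Seen this way, a hang in the original loop suggests that the workers never finish connecting and that checkFcn does not detect a failure, rather than a fault in the loop itself.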
2 comments
Jeremy on 16 Sep 2022
5. I also tried using the "Validate" option in the Cluster Profile Manager, but validation got stuck at "Job test (createJob)".
Edric Ellis on 20 Sep 2022
If validation got stuck at the "createJob" stage, then that might well mean that for some reason worker processes aren't launching correctly. I suggest contacting MathWorks support directly.
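One way to gather more detail before contacting support is to run a single independent job by hand with a bounded wait, which roughly mirrors the job stage of validation. This is a minimal sketch, assuming the 'local' profile from R2021b; the 60-second limit is an arbitrary choice, and the sketch only reports states without diagnosing the cause.

```matlab
c = parcluster('local');

% One independent job containing a single trivial task.
j = createJob(c);
createTask(j, @rand, 1, {3});
submit(j);

% Wait at most 60 seconds instead of blocking indefinitely.
finished = wait(j, 'finished', 60);
fprintf('Job state after waiting: %s\n', j.State);

if finished
    out = fetchOutputs(j);
    disp(size(out{1}));
else
    % A task that never leaves 'pending' or 'queued' is consistent with a
    % worker process that did not start.
    fprintf('Task state: %s\n', j.Tasks(1).State);
end

% Remove the test job from the job storage location.
delete(j);
```

The job and task states printed here are useful details to include in a support request.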


Answers (0)

Release

R2021b
