parpool failed to start

Chad Greene on 7 Sep 2022
Commented: Chad Greene on 8 Sep 2022
I'm trying to start a parallel pool in MATLAB R2020b, but the pool repeatedly fails to start. Can anyone offer guidance? Here are the messages I get:
VALIDATION REPORT
Profile: LocalProfile1
Scheduler Type: Local
Stage: Cluster connection test (parcluster)
Status: Passed
Start Time: Wed Sep 07 15:08:24 PDT 2022
Finish Time: Wed Sep 07 15:08:24 PDT 2022
Running Duration: 0 min 0 sec
Description:
Error Report:
Command Line Output:
Debug Log:
Stage: Job test (createJob)
Status: Passed
Start Time: Wed Sep 07 15:08:24 PDT 2022
Finish Time: Wed Sep 07 15:08:59 PDT 2022
Running Duration: 0 min 35 sec
Description:
Error Report:
Command Line Output:
Debug Log:
Stage: SPMD job test (createCommunicatingJob)
Status: Passed
Start Time: Wed Sep 07 15:08:59 PDT 2022
Finish Time: Wed Sep 07 15:10:11 PDT 2022
Running Duration: 1 min 12 sec
Description: Job ran with 64 workers.
Error Report:
Command Line Output:
Debug Log:
Stage: Pool job test (createCommunicatingJob)
Status: Passed
Start Time: Wed Sep 07 15:10:11 PDT 2022
Finish Time: Wed Sep 07 15:10:58 PDT 2022
Running Duration: 0 min 47 sec
Description: Job ran with 64 workers.
Error Report:
Command Line Output:
Debug Log:
Stage: Parallel pool test (parpool)
Status: Failed
Start Time: Wed Sep 07 15:10:58 PDT 2022
Finish Time: Wed Sep 07 15:16:02 PDT 2022
Running Duration: 5 min 5 sec
Description: Failed to initialize the interactive session.
Error Report: Failed to initialize the interactive session.
Caused by:
Error using parallel.internal.pool.InteractiveClient>iThrowIfBadParallelJobStatus (line 789)
The interactive communicating job failed with no message.
Command Line Output:
Debug Log:
Error using parpool (line 139)
Parallel pool failed to start with the following error. For more detailed information, validate the profile 'LocalProfile1' in the Cluster
Profile Manager.
Error using parallel.internal.pool.InteractiveClient>iThrowWithCause (line 678)
Failed to initialize the interactive session.
Error in parallel.internal.pool.InteractiveClient/start (line 376)
iThrowWithCause( 'parallel:convenience:FailedToInitializeInteractiveSession', err );
Error in parallel.internal.pool.AbstractClusterPool>iStartClient (line 826)
spmdInitialized = client.start(poolType , numWorkers, cluster, ...
Error in parallel.internal.pool.AbstractClusterPool.hBuildPool (line 596)
iStartClient(client, 'pool', cluster, guiMode, supportRestart, argsList);
Error in parallel.internal.types.ValidationStages>iOpenPoolForCluster (line 399)
aPool = parallel.internal.pool.AbstractClusterPool.hBuildPool('Cluster', cluster, 'NumWorkers', numWorkers);
Error in parallel.internal.types.ValidationStages>@()iOpenPoolForCluster(runInfo)
Error in parallel.internal.types.ValidationStages>iCallWithNoHotlinks (line 311)
[varargout{1:nargout}] = fcn();
Error in parallel.internal.types.ValidationStages>iRunParpoolStage (line 226)
[commandWindowOutput, aPool] = evalc(iWrapForEvalc(openPoolFcn));
Error in parallel.internal.types.ValidationStages/run (line 55)
[eventData, runInfo] = obj.RunFunction(obj, runInfo);
Error in parallel.internal.validator.Validator/runValidationSuite (line 191)
[eventData, stageRunInfo] = currentStage.run(stageRunInfo);
Error in parallel.internal.validator.Validator/validate (line 103)
status = obj.runValidationSuite(profileName, suite);
Error in parallel.internal.ui.AbstractValidationManager/validate (line 36)
obj.Validator.validate(profileName, validationSuite);
Error in parallel.internal.ui.ValidationManager.validateProfile (line 36)
parallel.internal.ui.ValidationManager.getOrCreateInstance().validate(profileName, suite);
Caused by:
Error using parallel.internal.pool.InteractiveClient>iThrowIfBadParallelJobStatus (line 789)
The interactive communicating job failed with no message.
4 Comments
Raymond Norris on 8 Sep 2022
Let's try one thing to get more diagnostics, and then turn this over to Technical Support (support@mathworks.com).
pctconfig('preservejobs',true);      % preserve job data for later inspection
local = parcluster("local");
pool = local.parpool(64);
% <... wait for error ...>
local.getDebugLog(local.Jobs(end))   % print the debug log of the most recent job
It would also be interesting to see whether you have issues with fewer than 64 workers (e.g., try 32 and/or 48 workers).
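The worker-count experiment suggested above could be scripted roughly as follows. This is only an illustrative sketch, not something posted in the thread: the `sizes` list and the `started` and `messages` variables are assumptions chosen for the example, and the `pctconfig('preservejobs',true)` call is simply copied from the comment above.

```matlab
% Sketch: try several pool sizes on the local cluster and record which start.
sizes    = [8 32 48 64];            % illustrative worker counts (assumption)
started  = false(size(sizes));      % true where the pool opened
messages = strings(size(sizes));    % error text for sizes that failed

pctconfig('preservejobs', true);    % as in the comment above
local = parcluster("local");

for k = 1:numel(sizes)
    delete(gcp('nocreate'));        % close any pool that is still open
    try
        pool = parpool(local, sizes(k));
        started(k) = true;
        delete(pool);               % shut the pool down before the next try
    catch err
        messages(k) = string(err.message);
    end
end

% Summarize the outcome for each pool size.
disp(table(sizes(:), started(:), messages(:), ...
    'VariableNames', {'Workers', 'Started', 'Message'}));
```

If every size fails with the same message, the worker count is probably not the cause. If small pools start and large ones do not, that points toward a resource limit on the machine.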
Chad Greene on 8 Sep 2022
Hmm... the output message is empty, because once again "the interactive communicating job failed with no message".
I've tried with 64, 12, and 8 workers.
>> local = parcluster("local");
pool = local.parpool(64);
Starting parallel pool (parpool) using the 'local' profile ...
Error using parallel.Cluster/parpool (line 86)
Parallel pool failed to start with the following error. For more detailed information, validate the
profile 'local' in the Cluster Profile Manager.
Caused by:
Error using parallel.internal.pool.InteractiveClient>iThrowWithCause (line 678)
Failed to initialize the interactive session.
Error using parallel.internal.pool.InteractiveClient>iThrowIfBadParallelJobStatus (line 789)
The interactive communicating job failed with no message.
>> local.getDebugLog(local.Jobs(end))
LOG FILE OUTPUT:
>>
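Since the debug log comes back empty, one further thing that might be worth checking is the state and error fields of the preserved job and its tasks. The snippet below is a hedged sketch, not a step anyone in the thread proposed. It assumes the failed pool job is still listed in `local.Jobs`, which depends on the `preservejobs` setting used earlier, and these fields may turn out to be empty as well.

```matlab
% Sketch: inspect the most recent job on the local cluster and its tasks.
local = parcluster("local");
jobs  = local.Jobs;

if isempty(jobs)
    disp("No jobs are stored for this cluster.");
else
    job = jobs(end);                              % most recent job
    disp("Job " + job.ID + " state: " + string(job.State));
    tasks = job.Tasks;
    for k = 1:numel(tasks)
        t = tasks(k);
        % string() keeps an empty error message from truncating the output
        disp("Task " + t.ID + ": state=" + string(t.State) + ...
             ", error=" + string(t.ErrorMessage));
    end
end
```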

Answers (0)
