Help with parcluster, createJob, and createTask
Sossena Wood
on 4 Aug 2014
Commented: Thomas Ibbotson
on 7 Aug 2014
I've recently changed our code to use parcluster, createJob and createTask instead of matlabpool. It works well in MATLAB on our node with 16 cores, but it crashes on our newer nodes with 32 cores. The code runs for 30 minutes, and when it is time to distribute the work across the 32 cores, it crashes. To verify the core count, I get the following from our cluster:
MATLAB detected: 32 physical cores.
MATLAB detected: 32 logical cores.
MATLAB was assigned: 32 logical cores by the OS.
MATLAB is using: 32 logical cores.
The error is with parallel submit:
I get the following message from output_8_2_14.txt:
{Error using parallel.Job/submit (line 304)
Java exception occurred:
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.addWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at com.mathworks.toolbox.distcomp.local.LocalScheduler.submitInOrder(LocalScheduler.java:143)
    at com.mathworks.toolbox.distcomp.local.LocalScheduler.submit(LocalScheduler.java:138)
    at com.mathworks.toolbox.distcomp.local.AbstractLocalCommand.submit(AbstractLocalCommand.java:172)
Error in optimizeCall_v208 (line 192)
submit(j);
Error in optimization_BF_v3 (line 414)
[x1, fval, h, fid]=optimizeCall_v208(bField_ROI,bField_NEG,h,fid); %#ok<NOPRT>
}
>> MATLAB: runtime/shutdown.cpp:168: bool mnShutdownMatlabInternal(bool, bool, const boost::optional<int>&, int*, bool, bool): Assertion `Unexpected exception during MATLAB shutdown: boost::thread_resource_error' failed.
------------------------------------------------------------------------
Assertion detected at Sat Aug 2 09:55:00 2014
------------------------------------------------------------------------
Configuration:
Crash Decoding : Disabled
Current Visual : None
Default Encoding : UTF-8
GNU C Library : 2.14 stable
MATLAB Architecture: glnxa64
MATLAB Root : /usr/local/MATLAB/R2014a
MATLAB Version : 8.3.0.532 (R2014a)
Operating System : Linux 2.6.40.3-0.fc15.x86_64 #1 SMP Tue Aug 16 04:10:59 UTC 2011 x86_64
Processor ID : x86 Family 31 Model 9 Stepping 1, AuthenticAMD
Virtual Machine : Java 1.7.0_11-b21 with Oracle Corporation Java HotSpot™ 64-Bit Server VM mixed mode
Window System : No active display
Fault Count: 1
Assertion in bool mnShutdownMatlabInternal(bool, bool, const boost::optional<int>&, int*, bool, bool) at runtime/shutdown.cpp line 168:
Unexpected exception during MATLAB shutdown: boost::thread_resource_error
Register State (captured):
RAX = 00007f3a3f87d900 RBX = 00007f3a3f87df00 RCX = 0000000000000012 RDX = 00007f3a57586df8
RSP = 00007f3a3f87d710 RBP = 00007f3a3f87dad0 RSI = 0000000000000001 RDI = 00007f3a3f87d720
R8 = 0000000000000000 R9 = 000000000000a0c4 R10 = 0000014200000001 R11 = 00007f39d700f360
R12 = 00007f3a575a7d20 R13 = 00007f3a563bc24e R14 = 00007f3a563bc320 R15 = 00007f3a3f87e740
RIP = 00007f3a5729a4ee EFL = 0000000000000003
matlab_crash_dump.41110-1.txt states:
------------------------------------------------------------------------
Assertion detected at Sat Aug 2 09:55:00 2014
------------------------------------------------------------------------
Configuration:
Crash Decoding : Disabled
Current Visual : None
Default Encoding : UTF-8
GNU C Library : 2.14 stable
MATLAB Architecture: glnxa64
MATLAB Root : /usr/local/MATLAB/R2014a
MATLAB Version : 8.3.0.532 (R2014a)
Operating System : Linux 2.6.40.3-0.fc15.x86_64 #1 SMP Tue Aug 16 04:10:59 UTC 2011 x86_64
Processor ID : x86 Family 31 Model 9 Stepping 1, AuthenticAMD
Virtual Machine : Java 1.7.0_11-b21 with Oracle Corporation Java HotSpot™ 64-Bit Server VM mixed mode
Window System : No active display
Fault Count: 1
Assertion in bool mnShutdownMatlabInternal(bool, bool, const boost::optional<int>&, int*, bool, bool) at runtime/shutdown.cpp line 168:
Unexpected exception during MATLAB shutdown: boost::thread_resource_error
Register State (captured):
RAX = 00007f3a3f87d900 RBX = 00007f3a3f87df00 RCX = 0000000000000012 RDX = 00007f3a57586df8
RSP = 00007f3a3f87d710 RBP = 00007f3a3f87dad0 RSI = 0000000000000001 RDI = 00007f3a3f87d720
R8 = 0000000000000000 R9 = 000000000000a0c4 R10 = 0000014200000001 R11 = 00007f39d700f360
R12 = 00007f3a575a7d20 R13 = 00007f3a563bc24e R14 = 00007f3a563bc320 R15 = 00007f3a3f87e740
RIP = 00007f3a5729a4ee EFL = 0000000000000003
CS = 0003 FS = 0000 GS = 0000
Stack Trace (captured):
[ 0] 0x00007f3a5729a4ee /usr/local/MATLAB/R2014a/bin/glnxa64/libmwfl.so+00972014 _ZN2fl4diag5linux12context_base12capture_dataEv+00000030
If this problem is reproducible, please submit a Service Request via: http://www.mathworks.com/support/contact_us/
A technical support engineer might contact you with further information.
Thank you for your help.
Accepted Answer
Thomas Ibbotson
on 4 Aug 2014
It looks like the number of processes you can run in your user account is limited. You can find out what it is set to by running the following command in a terminal:
ulimit -a
which will give you output like:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256447
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 128000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 500000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
If 'max user processes' is too low, you may need to edit /etc/security/limits.conf to increase it.
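To judge whether the limit is actually being hit, it can help to compare it with the number of threads the account already owns, since on Linux threads generally count toward 'max user processes' (which would fit the "unable to create new native thread" error). The following is a minimal sketch, assuming bash and a mounted /proc; the variable names are illustrative, and the thread count is only an approximation:

```shell
#!/usr/bin/env bash
# Sketch: compare the per-user process limit with current thread usage (Linux only).

soft_limit=$(ulimit -S -u)   # limit currently enforced for this shell
hard_limit=$(ulimit -H -u)   # ceiling the soft limit can be raised to

# Count thread entries under /proc owned by the current user.
# Approximation: ownership follows the effective UID, and entries can
# appear or vanish while the loop runs.
thread_count=0
for task_dir in /proc/[0-9]*/task/[0-9]*; do
  if [ -O "$task_dir" ]; then
    thread_count=$((thread_count + 1))
  fi
done

echo "soft nproc limit: $soft_limit"
echo "hard nproc limit: $hard_limit"
echo "threads owned by $(id -un): $thread_count"
```

Running this on the 32-core node while the job is close to the submit step should show whether the thread count is approaching the soft limit.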
3 Comments
Thomas Ibbotson
on 7 Aug 2014
Add a line like this:
username hard nproc 100000
but replace 'username' with your username. If you want it to be set for a group, use '@groupname'; for all users, use '*'.
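Before and after editing, it may be useful to list which nproc entries are currently active. Here is a small sketch, assuming the usual four-field layout of limits.conf (domain, type, item, value); the helper name list_nproc_entries and the username alice are hypothetical and used only for illustration:

```shell
#!/usr/bin/env bash
# Sketch: list active (non-comment) nproc entries in limits.conf-style files.
# Expected fields per line: <domain> <type> <item> <value>
list_nproc_entries() {
  awk '!/^[[:space:]]*#/ && $3 == "nproc" { print FILENAME ": " $1, $2, $3, $4 }' "$@"
}

# Demonstration on a throwaway file; "alice" is a hypothetical username.
sample=$(mktemp)
printf '%s\n' '# comment line' 'alice hard nproc 100000' '@staff soft nofile 4096' > "$sample"
list_nproc_entries "$sample"
rm -f "$sample"
```

On a real system the function could be pointed at /etc/security/limits.conf and any files under /etc/security/limits.d/, since entries there may also set nproc. The new value normally applies only to sessions started after the change, so log in again and check it with ulimit -u.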