parfor results in core dump
MATLAB R2015b crashes on 64-bit Linux. I managed to track the cause of the failure to this chunk of code:
values = cell(1, nElement);
parfor i = 1:nElement
    values{i} = element(i).stiffnessMatrix;
end
where element is an object array and I call the stiffnessMatrix method on all of the objects. In practice nElement is very big. The segfault comes from libtbbmalloc.so.2 (now I know that parfor wraps TBB).
Reading on forums, I preloaded the problematic libraries:
export LD_PRELOAD=/usr/local/MATLAB/R2015b/bin/glnxa64/libtbb.so.2:/usr/local/MATLAB/R2015b/bin/glnxa64/libtbbmalloc.so.2
In vain. The same error persists. It seems that TBB cannot allocate memory (for smaller problems, parfor works). Why does it work for the simple for loop and not for the parfor loop? Just before the program crashed, I saw that I had 11 GB of free memory (more than 4 times the amount used by the whole operating system).
I tested it on Windows, and it still fails with the same error. Then I tried with a simple for loop, not a parfor loop on both Linux and Windows. On Linux, the error persists but on Windows the for loop version works.
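One rough way to put the 11 GB figure in context is to estimate how much memory the results alone would need. The sketch below is only illustrative: the 8-by-8 matrix size and the value of nElement are assumptions, since the thread gives neither, and it ignores per-cell overhead and any copies made for workers.

```matlab
% Assumed sizes; the thread gives neither the matrix size nor nElement.
nElement = 1e6;          % hypothetical number of elements
K = zeros(8);            % hypothetical stand-in for one stiffness matrix

info = whos('K');        % info.bytes is the data size of K in bytes
totalGB = info.bytes * nElement / 2^30;
fprintf('Approximate payload of values: %.2f GB\n', totalGB);
```

If the estimate is far below the free memory, the crash is less likely to be a plain out-of-memory condition.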
Thanks in advance.
10 comments
OCDER
16 Nov 2017
If this works with a regular for loop, but not parfor, it's likely a concurrency issue. If element is an object and multiple workers are trying to split, modify, and reassemble an object, then maybe it crashes.
What is the object array element (what is the code that defines this object, or what does it store?)
What does element(i).stiffnessMatrix do to all these objects?
Do you have a simple full set of code for us to test that for loop?
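One way to separate a concurrency problem from a problem in the loop body is to keep the parfor syntax but cap the number of workers at zero, which makes MATLAB run the body on the client. This is a minimal sketch: `stiffness` is a hypothetical stand-in for `element(i).stiffnessMatrix`, whose code is not shown in the thread.

```matlab
% Hypothetical stand-in: the real element array and its stiffnessMatrix
% method are not shown in the thread, so a small function is used instead.
nElement = 8;
stiffness = @(k) k * eye(2);

values = cell(1, nElement);
% The second argument caps the worker count; with 0 the body runs on the
% client, still under parfor's variable classification rules.
parfor (i = 1:nElement, 0)
    % feval avoids stiffness(i) being read as sliced indexing by parfor.
    values{i} = feval(stiffness, i);
end
```

If the crash still occurs with zero workers, the worker pool itself is unlikely to be the cause.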
Zoltán Csáti
17 Nov 2017
OCDER
17 Nov 2017
I would suspect the 3rd party MEX functions have a bug that manifests only when processing large data. You may have to debug the MEX source code directly, as segmentation faults indicate a pointer error in the C code. Finding this bug would be VERY hard, especially since the code sometimes works and then fails randomly, and in both parfor and for loops...
You could try to pinpoint the inputs that cause the crash by using a regular for loop and saving variables after each successful loop. Once it crashes, restart from the prior loop and see if the next loop crashes again. If you can replicate the crash conditions, then debugging just might be possible.
Zoltán Csáti
17 Nov 2017
OCDER
17 Nov 2017
Yes, something like that. For instance:
values = cell(1, nElement);
for i = 1:nElement
    save('temp.mat');
    values{i} = element(i).stiffnessMatrix;
    mexFunction2(...)
    mexFunction3(...)
end
If it always crashes on the 79th iteration, then you can load 'temp.mat' and run the remaining code in the loop line by line until you find the MEX function causing the crash.
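The restart step could look roughly like the sketch below. It is an assumption-laden illustration, not the original code: `stiffness` is a hypothetical placeholder for the real method, the checkpoint file name is made up, and only `i` and `values` are saved rather than the whole workspace to keep each checkpoint small.

```matlab
% Stand-in data; in the thread these come from the user's own model.
nElement = 5;
stiffness = @(k) k * ones(2);        % hypothetical placeholder
checkpoint = [tempname, '.mat'];     % hypothetical checkpoint file

values = cell(1, nElement);
for i = 1:nElement
    save(checkpoint, 'i', 'values'); % state before iteration i runs
    values{i} = feval(stiffness, i);
end

% After a crash and restart, the file holds the index of the iteration
% that was in progress and the results completed before it.
S = load(checkpoint);
firstEmpty = find(cellfun(@isempty, S.values), 1);
fprintf('Last iteration started: %d\n', S.i);
delete(checkpoint);
```

Here the loop finishes normally, so the checkpoint simply reflects the state just before the last iteration; after a real crash, `S.i` would identify the iteration to rerun line by line.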
Zoltán Csáti
17 Nov 2017
OCDER
17 Nov 2017
Without using any mex functions, does the loop at least crash at the same iteration number?
Zoltán Csáti
17 Nov 2017
OCDER
17 Nov 2017
This does sound like one of the worst-case scenarios for debugging code. You might have to contact the authors of the 3rd party MEX code to find the memory allocation bug. Without looking at all of the code, it will be hard to pinpoint the issue. This might help with debugging strategies:
Zoltán Csáti
18 Nov 2017
Answers (0)