Segmentation Violation and Memory Assertion Failure

I am running multiple MATLAB jobs in parallel on a Sun Grid Engine cluster that uses MATLAB R2016b; on my personal MacBook I run MATLAB R2016a. The script does some MRI image processing, and each job uses a different set of parameters so that I can optimize the parameters of my image processing routine. About half of the jobs crash, however, with segmentation violations, memory assertion failures ('You may have modified memory not owned by you.'), or errors from HDF5-DIAG followed by a segmentation violation.
Some observations:
  1. The errors do not always occur in the same jobs or in the same functions.
  2. I no longer grow arrays dynamically; I preallocate them. If an array turns out to be too small, I extend it with, for example, cat(2, array, zeros(1, 2000)).
  3. The jobs partly perform the same computations, so they can share data. Each job first checks whether the data has already been generated by another job. If so, it tries to load the data in a while loop with a maximum number of attempts and a 1-second pause between attempts (loading may fail while another job is still writing the file, but may succeed after a short wait). If loading still fails after the maximum number of attempts, or if the data does not exist yet, the job performs the required computations itself and tries to save the data. If another job has saved the data in the meantime, this job does not save it again.
  4. I am not using any C/C++ or MEX files.
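
For reference, the sharing logic in observation 3 can be sketched roughly as below. This is a minimal illustration, not the actual code: the function name loadOrCompute, the variable name data, and the arguments cacheFile (assumed to be a full path ending in .mat), computeFcn, and maxAttempts are all hypothetical.

```matlab
function data = loadOrCompute(cacheFile, computeFcn, maxAttempts)
% loadOrCompute  Hypothetical helper sketching the load-or-compute scheme.
%   cacheFile   - full path of the shared MAT-file, including ".mat" (assumed)
%   computeFcn  - function handle that returns the data when called
%   maxAttempts - maximum number of load attempts before giving up

data = [];
loaded = false;
attempt = 0;

% If another job already produced the file, try to load it with retries.
if exist(cacheFile, 'file') == 2
    while ~loaded && attempt < maxAttempts
        attempt = attempt + 1;
        try
            s = load(cacheFile, 'data');
            data = s.data;
            loaded = true;
        catch
            % The file may still be written by another job; wait and retry.
            pause(1);
        end
    end
end

% Loading failed or the file does not exist yet: compute the data here.
if ~loaded
    data = computeFcn();
    % Skip the save if another job wrote the file in the meantime.
    if exist(cacheFile, 'file') ~= 2
        save(cacheFile, 'data');
    end
end
end
```

A call would look like data = loadOrCompute(fullfile(tempdir, 'shared.mat'), @() magic(4), 10). Note that in this sketch the existence check and the save are two separate steps, so two jobs can still pass the check at nearly the same time and write the same file.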

Answers (0)

Asked: 2 Jun 2017
