Parallel computing: occasional exception message "Message Catalog MATLAB:load was not loaded from the file"
I am running two jobs on a cluster, job1 on node1, job2 on node2. Job1 starts a little bit earlier than job2.
Everything is fine for job1. For job2, however, I sometimes get this exception message on the command line:
Caught "std::exception" Exception message is:
Message Catalog MATLAB:load was not loaded from the file. Please check file location, format or contents
When this happens, the job does not stop, but it no longer does any calculation, i.e. it hangs.
I suspect this is due to one of the following reasons:
- It is related to the load() function. I do use load() in my parfor-loop. However, I thought load() is different from fopen(), which needs to be followed by fclose(). So, do I have to take any action when using load() in a parfor-loop?
- It is related to the Linux system. This may occur when there are too many open files. However, I do not open any files in my parfor-loop.
- It is related to the Linux system, and I am using too many resources. When I run only one job, this exception message never shows up.
Has anyone run into this?
Edric Ellis on 11 Dec 2020
Edited: Edric Ellis on 11 Dec 2020
The probable cause of this is the file handle limit; see https://www.mathworks.com/help/parallel-computing/recommended-system-limits-for-machintosh-and-linux.html for some instructions. Basically, I think you need to raise the ulimit values on the system.
(The other thing to check is that you aren't opening lots of file handles using fopen and not subsequently closing them with fclose.)
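As a rough sketch of how to check this on a node, the following bash snippet prints the current file descriptor limits and counts the descriptors open in the current shell. It assumes a Linux machine with /proc available. The variable names are just for illustration, and the numbers you see will depend on how your cluster is configured.

```shell
#!/bin/bash
# Soft limit on open file descriptors for this shell and the processes it starts.
soft=$(ulimit -Sn)
# Hard limit: the ceiling up to which an unprivileged user may raise the soft limit.
hard=$(ulimit -Hn)
echo "open files: soft limit = $soft, hard limit = $hard"

# Count descriptors currently open in this shell (Linux-only, relies on /proc).
if [ -d "/proc/$$/fd" ]; then
    open_fds=$(ls "/proc/$$/fd" | wc -l)
else
    open_fds="unknown"
fi
echo "descriptors open in this shell: $open_fds"

# To raise the soft limit for this session only, something like the line below
# could be run before starting MATLAB or the workers (left commented out here):
# ulimit -n "$hard"
```

To inspect a running worker rather than the shell, replace `$$` with that process's PID. A count that keeps growing towards the soft limit while the parfor-loop runs would point at leaked file handles rather than at a limit that is simply too low.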