The difference between NumThreads and NumWorkers in parpool, and thread-safety issues
I'm trying to run my simulation on a machine with 16 physical cores and 32 logical cores. I already understand that for compute-bound tasks it does not help much to allocate two or more threads to each physical core. But if my task really could benefit from hyperthreading, which setup should I use: ① NumWorkers=32, NumThreads=1, or ② NumWorkers=16, NumThreads=2? Or is there no difference in performance between them?
The second question is about thread safety when NumThreads ≥ 2. In my very simple case, I call the built-in function rand frequently. In similar C++ code (also using rand from the standard library), this causes a performance problem with multiple threads, because rand uses a mutex to ensure thread safety. Different threads then compete and wait on each rand call, and CPU utilization drops.
I don't know whether the built-in rand in MATLAB has a similar mutex mechanism. With multiple processes (rather than multiple threads) it should cause no problem, because each process has its own context. The question arises when NumThreads is set to 2 or more. Do the threads belonging to the same worker share their context, so that they also compete for global resources, just like in the C++ situation I described above?
Thanks in advance for your answer!
Answers (1)
Ashutosh Thakur
on 18 Jun 2024
Hello HeRan,
The optimal mix of workers and threads greatly depends on the specifics of your use case. Determining the right combination often requires a trial-and-error approach. However, between the two options presented, the first configuration (NumWorkers=32, NumThreads=1) is generally more effective when the tasks involved do not necessitate significant synchronization or access to shared resources. On the other hand, the second configuration (NumWorkers=16, NumThreads=2) may be advantageous when your process involves shared resources, and having multiple threads can be beneficial, particularly if one thread is waiting or blocked.
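One way to run the trial-and-error comparison described above is to time the same workload under both pool layouts. The following is only a minimal sketch, assuming Parallel Computing Toolbox and a default cluster profile that points at the local machine. The task count, the sample size, and the toy workload are hypothetical stand-ins for the real simulation, and some releases may warn about or reject a given NumWorkers/NumThreads combination.

```matlab
% Sketch: time the same parfor workload under two pool layouts.
% Assumes Parallel Computing Toolbox and a local default cluster profile.
configs = [32 1; 16 2];      % [NumWorkers NumThreads], values from the question
nTasks  = 128;               % hypothetical number of independent tasks
n       = 2e6;               % hypothetical number of samples drawn per task
elapsed = zeros(size(configs, 1), 1);

c = parcluster;              % default profile; local machine assumed
for k = 1:size(configs, 1)
    delete(gcp('nocreate')); % close any pool that is already open
    c.NumWorkers = configs(k, 1);
    c.NumThreads = configs(k, 2);
    pool = parpool(c, configs(k, 1));

    for rep = 1:2            % first pass warms up the pool; keep the second
        s = zeros(1, nTasks);
        t0 = tic;
        parfor i = 1:nTasks
            s(i) = sum(rand(n, 1));   % toy stand-in for one simulation task
        end
        elapsed(k) = toc(t0);
    end

    fprintf('NumWorkers=%d, NumThreads=%d: %.3f s\n', ...
        configs(k, 1), configs(k, 2), elapsed(k));
    delete(pool);
end
```

The timings are only meaningful for a workload that resembles the real simulation, so the body of the parfor loop should be replaced with a representative piece of it before drawing conclusions.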
Several factors affect performance, including the number of cores and the amount of RAM available per worker. For further insight into these parameters, you might find the following MATLAB answer helpful: https://www.mathworks.com/matlabcentral/answers/27749-cores-vs-speed-tradeoff-for-a-matlab-computer?s_tid=srchtitle
It can be challenging to make definitive statements about the internal workings of MATLAB's built-in rand function without direct experimentation. A practical way to understand its behavior under different conditions is to experiment with varying numbers of threads. By analyzing the outcomes, you can more accurately determine the most effective configuration for your use case.
Tools such as the MATLAB Profiler and the Simulink Profiler can be used to measure performance and help decide on the right combination: https://www.mathworks.com/help/matlab/matlab_prog/profiling-for-improving-performance.html
I hope this answers your query.
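A small version of that experiment can be run in a single MATLAB session, without a pool, by changing the number of computational threads and timing rand each time. This is only a sketch: the array size and the thread counts are arbitrary choices, and it makes no assumption about whether rand actually uses more than one thread internally. If the timings barely change, that suggests extra threads neither help nor hurt this call on your machine.

```matlab
% Sketch: time rand under different computational thread limits.
n            = 1e7;                 % hypothetical array size per call
threadCounts = [1 2];               % thread limits to compare
timings      = zeros(size(threadCounts));

prev = maxNumCompThreads;           % remember the current setting
for k = 1:numel(threadCounts)
    maxNumCompThreads(threadCounts(k));
    timings(k) = timeit(@() rand(n, 1));   % typical time of one call
    fprintf('%d thread(s): %.4f s per rand(%d,1) call\n', ...
        threadCounts(k), timings(k), n);
end
maxNumCompThreads(prev);            % restore the original setting
```

This measures a single process only. Contention between the threads of one pool worker would still need to be checked on the pool itself, for example by timing the real workload with different NumThreads values.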
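As a rough illustration of the profiler workflow, the sketch below profiles a toy workload in the client session and prints the most expensive entries. The `simulate` handle is a hypothetical stand-in for the real simulation, and which functions appear in the table depends on the code being profiled.

```matlab
% Hypothetical stand-in for the real simulation: sum of uniform draws.
simulate = @(n, reps) sum(arrayfun(@(r) sum(rand(n, 1)), 1:reps));

profile on
total = simulate(1e6, 20);
profile off

% Print up to five entries with the largest total time.
info = profile('info');
[~, order] = sort([info.FunctionTable.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    f = info.FunctionTable(k);
    fprintf('%-45s %8.3f s  (%d calls)\n', ...
        f.FunctionName, f.TotalTime, f.NumCalls);
end
```

Profiling started this way covers the client session only. For code running on pool workers, Parallel Computing Toolbox provides mpiprofile, or the workload can simply be timed per configuration with tic and toc.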