parfor load balancing chunksize
I have a function that uses a parfor (~100 iterations) to evaluate another function. However, one of the workers is twice as fast as the other two (it uses a GPU that is twice as fast as the ones used by the other workers). At some point the fast worker goes idle while the other two are still working through many iterations (say 5-10 each). I suspect that the fast worker has run out of available chunks from the parfor load balancing while the others are still busy with one of the larger chunks.
Is there a way to limit the maximum chunk size to, for instance, 2 or 3, so that resources no longer sit idle?
Edric Ellis on 22 Mar 2018
parfor offers no means of controlling the chunk size. parfeval allows full control over how you split work up - perhaps you can use that instead (unfortunately, this will require a bit of restructuring of your code).
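As a rough illustration of the parfeval approach, here is a minimal sketch that submits each iteration as its own task, so that whichever worker becomes free takes the next queued one. The function evaluateItem is a hypothetical stand-in for the expensive function from the question, and nIter = 100 is taken from the "~100 iterations" mentioned there; adapt both to the real code.

```matlab
% Hypothetical stand-in for the expensive function evaluated in the loop.
evaluateItem = @(k) k^2;

nIter = 100;                    % roughly the loop size from the question
pool = gcp();                   % current pool (opens one if auto-create is on)

% Submit one future per iteration. Tasks are queued and handed to workers
% as they become free, so a faster worker simply completes more of them.
futures(1:nIter) = parallel.FevalFuture;
for k = 1:nIter
    futures(k) = parfeval(pool, evaluateItem, 1, k);
end

% Collect results in completion order; idx is the position in futures.
results = zeros(1, nIter);
for n = 1:nIter
    [idx, value] = fetchNext(futures);
    results(idx) = value;
end
```

If the per-task overhead turns out to matter, each future could instead process a small batch of 2 or 3 iterations, which gives the chunk size asked about in the question.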
More Answers (1)
William Smith on 26 Mar 2018
Edited: William Smith on 26 Mar 2018
This issue of heterogeneous workloads (or, in your case, heterogeneous workers) and parfor's lack of proper support for them has come up a number of times over the years.
I solved this in my own domain because I know in advance roughly how long each piece of work will take to process. I then used the 'Longest Processing Time' scheduling algorithm, described at https://en.wikipedia.org/wiki/Multiprocessor_scheduling, to preallocate the work into an array of length parpool().NumWorkers. Each element of the array holds multiple pieces of work, and that assignment is what my scheduling algorithm optimises.
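A minimal sketch of the Longest Processing Time idea, not my actual code: the durations in estTime and the worker count are made-up values for illustration, and in practice nWorkers would come from the pool's NumWorkers property.

```matlab
% Hypothetical estimated durations (e.g. seconds) for each piece of work.
estTime = [7 3 9 2 5 8 4 6];
nWorkers = 3;                   % assumption; normally the pool's NumWorkers

% LPT: take jobs in order of decreasing duration and give each one to the
% worker whose assigned load is currently smallest.
[~, order] = sort(estTime, 'descend');
workerLoad = zeros(1, nWorkers);
assignment = cell(1, nWorkers); % assignment{w} lists job indices for worker w
for j = order
    [~, w] = min(workerLoad);
    assignment{w}(end+1) = j;
    workerLoad(w) = workerLoad(w) + estTime(j);
end
```

The resulting cell array has one element per worker, each holding several pieces of work, which matches the array layout described above.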