How are gpuArrays handled inside parfor?

Garrett Good on 27 Nov 2017
Edited: Joss Knight on 27 Nov 2017
I've been going through various posts and am still a little unsure about how gpuArrays and functions behave inside parfor loops.
FYI, what I'm not trying to do is use multiple GPUs, or use GPU workers as MATLAB workers. In my current application, each worker runs one iteration of an optimization algorithm, and the parfor body mostly evaluates the cost function.
An expensive part of the parfor body consists of large matrix multiplications and interpolations, which I know run much faster on the GPU. Can multiple workers access a single GPU simultaneously (up until they exhaust the GPU memory), or does the work get serialized so that there's no benefit, even if a single iteration doesn't fully use the GPU?
On that note, can a constant gpuArray (or an 'object-wrapped' gpuArray?) be read by all workers simultaneously, or will each worker make its own copy on the GPU so that it can alter it?
Many thanks in advance for your expertise!

Answers (1)

Joss Knight on 27 Nov 2017
Edited: Joss Knight on 27 Nov 2017
Yes, they can all use the same GPU. By default, anything you run on the same GPU from different processes will run in serial. However, if each iteration also runs a lot of host-side code, the other workers can get on with that while they take turns with the GPU, so you can still get a benefit. Just be wary of how much memory you are using. By default, each MATLAB process will hog up to a quarter of GPU memory, so if you have four or more workers and you're using a lot of memory, you could find your GPU running out.
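As a rough illustration of that pattern, here is a minimal sketch in which each parfor iteration does some host-side work, takes a turn on the shared GPU for a matrix multiply, and gathers back only a scalar. The sizes, the iteration count, and the stand-in "cost function" are all hypothetical placeholders, not taken from the question. It also assumes that wrapping the data in `parallel.pool.Constant` suits your case: each worker is a separate process, so each one builds its own device copy of `A` once, rather than sharing a single copy.

```matlab
% Sketch only: hypothetical sizes and a stand-in cost function.
n       = 2000;              % hypothetical matrix size
numIter = 8;                 % hypothetical number of iterations
A       = rand(n, 'single'); % constant data, created on the host

pool = gcp;                  % use the current pool (or open the default one)

% Build the gpuArray once per worker instead of once per iteration.
% Each worker process ends up with its own copy in GPU memory.
Aconst = parallel.pool.Constant(@() gpuArray(A));

cost = zeros(1, numIter);
parfor k = 1:numIter
    % Host-side work: can proceed while another worker is on the GPU.
    x = rand(n, 1, 'single') * k;

    % GPU work: workers take turns on the device by default.
    yg = Aconst.Value * gpuArray(x);

    % Transfer back only the small result.
    cost(k) = gather(sum(yg .^ 2));
end

% Report how much device memory each worker currently sees as free.
spmd
    d = gpuDevice;           % the GPU selected on this worker
    t = getCurrentTask();    % identifies the worker
    fprintf('Worker %d: %.0f MB free of %.0f MB\n', ...
        t.ID, d.AvailableMemory / 2^20, d.TotalMemory / 2^20);
end
```

If the free memory reported by the workers gets close to zero, reducing the pool size or the per-worker data is the simplest fix.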
If you are on Linux, you can run the NVIDIA Multi-Process Service (MPS) to allow each process to use the GPU concurrently. However, this often doesn't gain you much, because code that is using the GPU 'well' will not leave any spare compute for another process. A bit like multi-threading on a single-core CPU, the apparent concurrency is still bottlenecked by the fact that there's actually only one processor.
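For reference, a minimal sketch of trying MPS from the MATLAB client. It assumes a Linux machine with the `nvidia-cuda-mps-control` tool from the NVIDIA driver on the path, sufficient permissions to start the daemon, and that no process (including this MATLAB session) has touched the GPU yet. The worker count is a hypothetical placeholder.

```matlab
% Sketch only: start the MPS control daemon before any worker uses the GPU.
if isunix && ~ismac
    status = system('nvidia-cuda-mps-control -d');   % -d runs it as a daemon
    if status ~= 0
        warning('Could not start the MPS daemon; workers will take turns on the GPU.');
    end
end

pool = parpool(4);           % hypothetical number of workers

% ... run the parfor loop that uses gpuArrays here ...

delete(pool);                % shut the workers down first

% Ask the daemon to quit once no GPU clients remain.
if isunix && ~ismac
    system('echo quit | nvidia-cuda-mps-control');
end
```

Timing the loop with and without the daemon running is the only reliable way to tell whether it helps for a given workload.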
