GPU computation freezes randomly on Windows 10
I'm experiencing a strange problem using GPU computation on a Windows 10 machine.
The function that causes the problem is a simple random walk called with arrayfun() for computation on the GPU, so nothing fancy. Since it only adds a random step to the position for a fixed number of time steps, it cannot get stuck in theory.
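For readers following along, a walk of the kind described above might look like the minimal sketch below. The function name `randomWalk`, the uniform step distribution, and all sizes are assumptions for illustration; the original code was not posted.

```matlab
% Minimal sketch of a per-element random walk run through arrayfun on the GPU.
% All names and sizes here are illustrative assumptions, not the poster's code.
nWalkers = 1e5;                         % number of independent walkers
nSteps   = 1000;                        % time steps per walker
stepSize = 0.1;                         % maximum step length

x0 = zeros(nWalkers, 1, 'gpuArray');    % start positions on the GPU

% arrayfun compiles randomWalk into a single GPU kernel; the scalar
% arguments nSteps and stepSize are expanded to every element.
xEnd = arrayfun(@randomWalk, x0, nSteps, stepSize);
xEnd = gather(xEnd);                    % copy the result back to host memory

function x = randomWalk(x, nSteps, stepSize)
% Add a uniform random step in [-stepSize, stepSize] at every time step.
for k = 1:nSteps
    x = x + stepSize * (2*rand - 1);
end
end
```

Because the whole loop runs inside one kernel launch, a large `nSteps` keeps the GPU busy for the full duration of that launch.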
The exact same code runs perfectly fine on Windows 7 and Windows 8.1 on the same machine, with a GTX 1070 and the TdrLevel 0 registry entry. I tried several different driver versions on Windows 10, but after some random time the computation freezes. The GPU load remains at 100%, but the power consumption drops from 45% to 25% and stays there forever. There is also no monitor connected to this GPU.
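Since each Windows installation keeps its own registry, it may be worth confirming that the TdrLevel value is actually present under Windows 10. The sketch below reads it from within MATLAB, assuming the standard location of the TDR settings under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`; it only reads the value and does not change it.

```matlab
% Read the TDR (Timeout Detection and Recovery) level from the registry.
% Windows only; winqueryreg throws if the value has not been created.
key = 'SYSTEM\CurrentControlSet\Control\GraphicsDrivers';
try
    tdrLevel = winqueryreg('HKEY_LOCAL_MACHINE', key, 'TdrLevel');
    fprintf('TdrLevel = %d (0 means timeout detection is disabled)\n', tdrLevel);
catch err
    fprintf('TdrLevel could not be read: %s\n', err.message);
end
```

If the value is missing, Windows uses its default timeout behaviour. Changes to these values normally take effect only after a reboot.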
Sometimes I can trigger this freeze by opening the Task Manager or GPU-Z, so it seems that the GPU freezes when something tries to query information from it.
How can I debug the reason for this freeze when using arrayfun? When it freezes I cannot use Ctrl+C to stop the computation in MATLAB; I have to kill the MATLAB task. There is also no error in the Command Window.
Many thanks in advance, Dominik
8 comments
Joss Knight
on 1 Nov 2017
Are you sure you're using the correct device? Try
for i = 1:gpuDeviceCount
    gpuDevice(i)   % select device i and display its properties
end
Cedric
on 1 Nov 2017
Also, sometimes this is going on:
Joss Knight
on 2 Nov 2017
I'll admit that the behaviour of Windows GPUs in WDDM mode often defies explanation, but what you have here is a graphics card with timeouts disabled running a long-running kernel and causing your graphics to become suspended. Logically, your GPU is doing some graphics. If this were a laptop it would be easy to explain.
It would be helpful to know what hardware you have and how it is configured. Can you run nvidia-smi and tell me what it says?
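One way to capture that report without leaving MATLAB is sketched below. It assumes `nvidia-smi` is on the system PATH; on many Windows driver installations it has to be called with its full path instead.

```matlab
% Run nvidia-smi from within MATLAB and capture its report.
% Assumes nvidia-smi is on the system PATH (an assumption, see above).
[status, report] = system('nvidia-smi');
if status == 0
    disp(report);
else
    fprintf('nvidia-smi failed (exit status %d):\n%s\n', status, report);
end
```

The report lists the driver version and the attached GPUs, and on Windows it also shows which driver model (WDDM or TCC) each GPU is using.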
Dominik Ludwig
on 3 Nov 2017
Edited: Dominik Ludwig
on 3 Nov 2017
Dominik Ludwig
on 3 Nov 2017
Joss Knight
on 5 Nov 2017
I'm afraid you've gone beyond my area of expertise. This looks like a hardware configuration issue, so you probably need to talk to NVIDIA. (Or perhaps you'll find someone more useful than me on this forum, of course...)
Joss Knight
on 5 Nov 2017
In answer to your original question, you can't debug an arrayfun kernel in MATLAB, because it's not MATLAB code that's executing but a GPU kernel compiled from that code. But you can try attaching a CUDA debugger or analysing behaviour in one of the CUDA tools, like the Visual Profiler. The profiler can tell you quite a lot about running kernels.
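Although the kernel itself cannot be stepped through, the host side can still be instrumented. The sketch below is not something proposed in this thread; it is one possible approach, assuming the walk can be split into shorter chunks of time steps. Each chunk is a separate kernel launch, and `wait` blocks until it has finished, so the last line printed shows how far the computation got before a freeze. The function name `walkChunk` and all sizes are hypothetical.

```matlab
% Host-side instrumentation sketch: run the walk in short chunks so that
% progress is visible between kernel launches. Names and sizes are assumptions.
nWalkers  = 1e5;
nSteps    = 1e5;                        % total time steps
chunkSize = 1e4;                        % steps per kernel launch
stepSize  = 0.1;

dev = gpuDevice;                        % currently selected GPU
x   = zeros(nWalkers, 1, 'gpuArray');

for first = 1:chunkSize:nSteps
    n = min(chunkSize, nSteps - first + 1);
    tic;
    x = arrayfun(@walkChunk, x, n, stepSize);
    wait(dev);                          % block until this chunk's kernel has finished
    fprintf('steps %d-%d done in %.3f s\n', first, first + n - 1, toc);
end

function x = walkChunk(x, n, stepSize)
% Advance one walker by n uniform random steps in [-stepSize, stepSize].
for k = 1:n
    x = x + stepSize * (2*rand - 1);
end
end
```

Shorter launches also give MATLAB a chance to react to Ctrl+C between chunks, which may help when a single long kernel cannot be interrupted.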
Dominik Ludwig
on 8 Nov 2017
Answers (0)