Crashing GPU driver consistently, is this normal?

David Parks on 20 Jun 2014
Commented: Rushikesh Tade on 21 Sep 2014
I can consistently crash the driver for my NVIDIA GTX 860M (CUDA 3.0) with the following command (the crash occurs on 'clear g'):
>> g = randi(7000,7000,'gpuArray'); g*g; clear ans; clear g;
In fact, I can crash it six ways to Sunday just by fiddling around with it.
I can recover with reset(...), but I then need to perform some operation on the GPU afterwards; otherwise certain operations, such as randi(7000,7000,'gpuArray'), will fail the next time they are run.
>> reset(gpuDevice(1)); randi(1,1,'gpuArray'); clear ans;
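The reset-then-warm-up workaround above can be written out step by step. This is only a minimal sketch: it assumes device index 1, as in the command above, and the variable names `dev` and `warmup` are illustrative.

```matlab
% Reset the GPU, then run a trivial operation before doing real work,
% mirroring the one-line workaround above.
dev = gpuDevice(1);                 % select device 1 (assumed index)
reset(dev);                         % invalidates all gpuArray data on the device
warmup = randi(1, 1, 'gpuArray');   % tiny operation, as in the workaround
clear warmup;
fprintf('%s (compute capability %s) reset, %.0f MB total memory\n', ...
    dev.Name, dev.ComputeCapability, dev.TotalMemory / 1e6);
```

Note that reset invalidates every existing gpuArray, so any data still needed should be gathered to the host first.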
Is this behavior to be expected? Are CUDA operations inherently "finicky"?
I tried to take the desktop off the NVIDIA GPU by using the NVIDIA Control Panel to default to the embedded GPU, though I'm not sure how to validate this beyond the NVIDIA UI. Perhaps a reboot is necessary after changing such settings?
-----
Follow-up detail: I just crashed it with these operations:
>> gg = gpuArray(g);
>> x = gg*gg;
>> y = x;
>> clear y
>> y = gather(x);
Error using gpuArray/gather
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_LAUNCH_TIMEOUT
That's making me wonder if these crashes are fundamentally occurring because of the timeout associated with a GPU that has a desktop on it.
Does anyone know how to validate that my GPU is free of any desktop influence? I want to run the desktop on my embedded GPU and leave the NVIDIA GPU free for MATLAB use.
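One check that is available from inside MATLAB is the KernelExecutionTimeout property of the gpuDevice object, which reports whether the operating system can abort long-running kernels on that device. The sketch below assumes device index 1; it does not show which display, if any, is attached, only whether a watchdog limit applies.

```matlab
% Query whether the OS enforces a time limit on kernels for this GPU.
dev = gpuDevice(1);   % device index 1 is an assumption
if dev.KernelExecutionTimeout
    disp('Kernel execution timeout is active: long-running kernels can be aborted.');
else
    disp('No kernel execution timeout is reported for this GPU.');
end
```

If the property is true, errors such as CUDA_ERROR_LAUNCH_TIMEOUT remain possible for long-running operations.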
Also, I've noticed that gathering a large array (say 7000x7000) from a gpuArray sometimes succeeds, while at other times it takes longer than 2 seconds and crashes. I wonder why the same operation would be nearly instantaneous sometimes and not others.
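A likely explanation for the variable gather time is that gpuArray operations are queued asynchronously: gg*gg can return before the multiplication has finished, and gather then has to wait for it. The sketch below separates the two costs using wait. The variable names follow the transcript above, the random input stands in for the original g, and the timings will depend on the hardware.

```matlab
% Separate the cost of the multiplication from the cost of the copy.
dev = gpuDevice;                       % currently selected GPU
gg  = randi(7000, 7000, 'gpuArray');   % same size as in the question

tic;
x = gg * gg;                           % may return before the GPU has finished
tLaunch = toc;

tic;
wait(dev);                             % block until all queued GPU work is done
tCompute = toc;

tic;
y = gather(x);                         % now only the device-to-host copy remains
tCopy = toc;

fprintf('launch %.3f s, compute %.3f s, copy %.3f s\n', tLaunch, tCompute, tCopy);
```

On a device with an active watchdog, a timeout would be expected to surface at wait or gather rather than at the multiplication line, which would match the error appearing on gather in the transcript above.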

Accepted Answer

David Parks on 25 Jun 2014
All of the issues were related to Windows TDR (Timeout Detection and Recovery). After disabling it, everything works fine.
https://www.youtube.com/watch?v=8NtHDkUoN98
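According to Microsoft's documentation, TDR is controlled by registry values under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers, notably TdrLevel (0 disables detection) and TdrDelay (the timeout in seconds, 2 by default). As a sketch, the current values can be read from MATLAB on Windows with winqueryreg. This only reads the values; it is not taken from the video, and changing the values requires administrator rights and a reboot.

```matlab
% Read the TDR registry values (Windows only). Both values are optional;
% winqueryreg raises an error when a value has not been created.
key   = 'SYSTEM\CurrentControlSet\Control\GraphicsDrivers';
names = {'TdrLevel', 'TdrDelay'};
for k = 1:numel(names)
    try
        val = winqueryreg('HKEY_LOCAL_MACHINE', key, names{k});
        fprintf('%s = %d\n', names{k}, val);
    catch
        fprintf('%s is not set (the Windows default applies)\n', names{k});
    end
end
```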
I did read in the MATLAB code that enabling "Tesla Compute Cluster mode" might also resolve the issue without mucking with the TDR settings (as the CUDA GPU would then not be extended to the desktop). I haven't been able to determine how to change this setting for an NVIDIA GTX 860M, so the question could be extended if someone knows the answer to that.
1 Comment
Rushikesh Tade on 21 Sep 2014
Worked like a charm on an ASUS ROG G750JM. For simplicity I am uploading the registry values.
