Neural networks - CUDAKernel/setConstantMemory - the data supplied is too big for constant 'hintsD'

On R2015a with Parallel Computing Toolbox and Neural Network Toolbox.
Using the following code with an Nvidia GeForce GTX 980 Ti GPU:
net1 = feedforwardnet(20);
net1.trainFcn = 'trainscg';
x = inputs(1:4284,2:2000)'; % if I reduce this to 2:1900, it will work
t = double(targets'); % casting to double for GPU
t = t(:,1:4284);
% preparing for GPU
xg = nndata2gpu(x);
tg = nndata2gpu(t);
net1.input.processFcns = {'mapminmax'};
net1.output.processFcns = {'mapminmax'};
net2 = configure(net1,x,t); % Configure with MATLAB arrays
net2 = train(net2,xg,tg);
As you can see, this is not a big dataset. When I run this, it generates this error:
Error using parallel.gpu.CUDAKernel/setConstantMemory
The data supplied is too big for constant 'hintsD'.
Error in nnGPU.codeHints (line 33)
setConstantMemory(hints.yKernel,'hintsD',hints.double);
Error in nncalc.setup2 (line 13)
calcHints = calcMode.codeHints(calcHints);
Error in nncalc.setup (line 17)
[calcLib,calcNet] = nncalc.setup2(calcMode,calcNet,calcData,calcHints);
Error in network/train (line 357)
[calcLib,calcNet,net,resourceText] = nncalc.setup(calcMode,net,data);
gpuDevice is showing this:
Name: 'GeForce GTX 980 Ti'
Index: 1
ComputeCapability: '5.2'
SupportsDouble: 1
DriverVersion: 8
ToolkitVersion: 6.5000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 6.4425e+09
AvailableMemory: 5.1520e+09
MultiprocessorCount: 22
ClockRateKHz: 1139500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
As noted in the code above, if I reduce x marginally, it will run.
I don't understand why data of this size would generate a memory error.
Am I missing a step in preparing this for GPU?

Accepted Answer

Amanjit Dulai on 3 Jan 2017
I was able to reproduce your issue. The best solution is to do the GPU training a different way, using the 'useGPU' flag of train. That code path does not use constant memory in this way, and it side-steps the issue. Your example code would look like this:
net1 = feedforwardnet(20);
net1.trainFcn = 'trainscg';
x = inputs(1:4284,2:2000)';
t = double(targets'); % casting to double for GPU
t = t(:,1:4284);
net1.input.processFcns = {'mapminmax'};
net1.output.processFcns = {'mapminmax'};
net1 = train(net1,x,t,'useGPU','yes');
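If you want to confirm that training actually ran on the GPU, train also accepts a 'showResources' option that reports the computing resources it used; a minimal sketch, assuming your toolbox version supports the option:
% Optional check: report which computing resources train used
net1 = train(net1,x,t,'useGPU','yes','showResources','yes');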

More Answers (2)

Joss Knight on 19 Dec 2016
Constant memory is a special fast read-only cache with 64 KB of space. That's enough to store about 8000 elements of double-precision data. Perhaps you want to use shared memory, which will give you 16 or 48 KB depending on your device configuration.
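For a rough sense of that limit, here is the arithmetic behind the "about 8000 elements" figure; a back-of-the-envelope sketch, assuming the full 64 KB of constant memory is available for the data:
constMemBytes  = 64 * 1024;                       % 64 KB of constant memory
bytesPerDouble = 8;                               % size of one double-precision element
maxDoubles     = constMemBytes / bytesPerDouble   % = 8192 elements (no semicolon, so the result is displayed)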
  2 comments
Jacob Townsend on 19 Dec 2016
Thanks a lot for the suggestion, Joss. It seems the default method is trying to allocate this to constant memory (in the error chain, it runs from nncalc all the way through to nnGPU.codeHints). Are you aware of a way to change these options to default to using shared memory only?
Joss Knight on 20 Dec 2016
Sorry, I don't know this code well enough to say what hints.double is and why it needs to be so large. I'll see if I can get someone who knows to help you.



Jacob Townsend on 19 Feb 2017
Thanks for that - solved that step!
