How do I substitute all the activation functions of a neural network?
Hi everyone!
I have an out-of-memory issue when substituting the activation functions in a neural network with other activation functions. This is the code that I use:
while index < totLayers - removedLayers
    if contains(lower(lgraph.Layers(index).Name),'relu')
        name = lgraph.Layers(index).Name;
        % Find the layers feeding into and out of this relu.
        conn = lgraph.Connections;
        for i = 1:size(conn,1)
            if strcmp(conn.Source{i},name)
                out = conn.Destination{i};
            elseif strcmp(conn.Destination{i},name)
                in = conn.Source{i};
            end
        end
        channels = findChannels(lgraph,in);
        % Create the new activation layers and splice them in place of the relu.
        newActivationLayers = createActivationLayers(newActivations,channels,index+removedLayers,relativeLearnRate,maxInput);
        lgraph = removeLayers(lgraph,name);
        lgraph = addLayers(lgraph,newActivationLayers);
        lgraph = connectLayers(lgraph,in,newActivationLayers(1).Name);
        lgraph = connectLayers(lgraph,newActivationLayers(end).Name,out);
        removedLayers = removedLayers + length(newActivationLayers);
    else
        index = index + 1;
    end
end
plot(lgraph)
findChannels and createActivationLayers are helper functions that only build the new layers to be inserted at that specific point in the network.
The code seems to work: when I plot lgraph, the output is correct. However, the GPU runs out of memory at training time. I tried to debug my code by substituting every activation in the network with itself (i.e. leaving lgraph effectively unchanged), and a network that I was previously able to train on my GPU gives an out-of-memory error once it has been rebuilt by my code.
The only difference I can see is that the order of the layers in lgraph.Layers differs from the original one, with all the activation layers at the end. However, the graph is correct, and I would be surprised if this were the problem.
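For reference, a rough, untested sketch of the same substitution using replaceLayer (available since R2018b), which removes the old layer, splices the new layers in sequentially, and rewires the connections itself; as far as I can tell it also keeps the layers in their original position. It reuses the helpers above, and the index k is only there to give each batch of new layers unique names:
names = {lgraph.Layers.Name};
reluNames = names(contains(lower(names),'relu'));
for k = 1:numel(reluNames)
    conn = lgraph.Connections;
    in = conn.Source{strcmp(conn.Destination,reluNames{k})};  % layer feeding this relu
    channels = findChannels(lgraph,in);
    newActs = createActivationLayers(newActivations,channels,k,relativeLearnRate,maxInput);
    lgraph = replaceLayer(lgraph,reluNames{k},newActs);       % splice in and rewire
end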
Does anyone know why I have this issue?
Answers (2)
Srivardhan Gadila
on 18 Jul 2020
I would suggest trying to train the network on the CPU; it may be that the GPU memory is simply not sufficient for training the new network.
Refer to the 'ExecutionEnvironment' name-value pair argument under Hardware Options of trainingOptions and set it to 'cpu'.
If you are able to train the network successfully on the CPU, then try reducing the mini-batch size while training on the GPU.
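Something along these lines; the solver and the values are only placeholders to adapt to your setup:
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment','cpu', ...  % run on the CPU first to rule out a bug
    'MiniBatchSize',16, ...            % lower than the default of 128, to save memory on the GPU later
    'MaxEpochs',5);
net = trainNetwork(imds,lgraph,options);  % imds: your training datastore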
Joss Knight
on 4 Aug 2020
It looks as though you've replaced every relu layer with multiple other layers, which makes your network deeper. The deeper the network, the more memory you need for training: that is how backpropagation works; it has to hold onto the activations from every layer. In addition, we can only guess at the memory requirements of your extra layers, since you don't say what they are.
I wonder what your extra layers are and why you need more than one new layer to replace something as simple as a relu activation.
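For example, you can compare the layer counts and inspect the per-layer activation sizes; lgraphOrig here stands for a copy of the graph taken before the substitution:
% More layers means more stored activations during backpropagation,
% and therefore more GPU memory.
fprintf('Original: %d layers, modified: %d layers\n', ...
    numel(lgraphOrig.Layers), numel(lgraph.Layers))
analyzeNetwork(lgraph)  % shows the activation size of every layer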
2 comments
Joss Knight
on 9 Aug 2020
Did you delete the first network before training the second network? Try calling reset(gpuDevice) before training the modified network.
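For example (imds and options stand for your own data and training options):
g = gpuDevice;  % handle to the current GPU
fprintf('Available before reset: %.2f GB\n', g.AvailableMemory/1e9)
reset(g)        % discards all gpuArray data and cached memory from the first run
net = trainNetwork(imds,lgraph,options);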