How do I substitute all the activation functions of a neural network?
Hi everyone!
I have an out-of-memory issue when substituting the activation functions in a neural network with other activation functions. This is the code that I use:
while index < totLayers - removedLayers
    if contains(lower(lgraph.Layers(index).Name),'relu')
        name = lgraph.Layers(index).Name;
        % Find the layers feeding into and out of this relu.
        conn = lgraph.Connections;
        for i = 1:size(conn,1)
            if strcmp(conn.Source{i},name)
                out = conn.Destination{i};
            elseif strcmp(conn.Destination{i},name)
                in = conn.Source{i};
            end
        end
        channels = findChannels(lgraph,in);
        % Create the new activation layers and splice them in place of the relu.
        newActivationLayers = createActivationLayers(newActivations,channels,index+removedLayers,relativeLearnRate,maxInput);
        lgraph = removeLayers(lgraph,name);
        lgraph = addLayers(lgraph,newActivationLayers);
        lgraph = connectLayers(lgraph,in,newActivationLayers(1).Name);
        lgraph = connectLayers(lgraph,newActivationLayers(end).Name,out);
        removedLayers = removedLayers + length(newActivationLayers);
    else
        index = index + 1;
    end
end
plot(lgraph)
findChannels and createActivationLayers are helper functions that only build the new layers to be inserted at that specific point in the network.
The code seems to work: when I plot lgraph, the output is correct. However, the GPU runs out of memory at training time. I tried to debug my code by substituting every activation in the network with itself (i.e. leaving lgraph effectively unchanged), and a network that I was previously able to train on my GPU gives an out-of-memory error once it has been rebuilt by my code.
The only difference I can see is that the order of the layers in lgraph.Layers differs from the original one, with all the activation layers at the end. However, the graph is correct, and I would be surprised if this were the problem.
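For reference, a rough, untested sketch of the same substitution using replaceLayer (available since R2018b), which removes the old layer, splices the new layers in sequentially, and rewires the connections itself; as far as I can tell it also keeps the layers in their original position. It reuses the helpers above, and the index k is only there to give each batch of new layers unique names:
names = {lgraph.Layers.Name};
reluNames = names(contains(lower(names),'relu'));
for k = 1:numel(reluNames)
    conn = lgraph.Connections;
    in = conn.Source{strcmp(conn.Destination,reluNames{k})};  % layer feeding this relu
    channels = findChannels(lgraph,in);
    newActs = createActivationLayers(newActivations,channels,k,relativeLearnRate,maxInput);
    lgraph = replaceLayer(lgraph,reluNames{k},newActs);       % splice in and rewire
end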
Does anyone know why I have this issue?
Answers (2)
Srivardhan Gadila
on 18 Jul 2020
I would suggest trying to train the network on the CPU; it may be that the GPU memory is simply not sufficient for training the new network.
Refer to the 'ExecutionEnvironment' name-value pair argument under Hardware Options of trainingOptions and set it to 'cpu'.
If you are able to train the network successfully on the CPU, then try reducing the mini-batch size while training on the GPU.
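Something along these lines; the solver and the values are only placeholders to adapt to your setup:
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment','cpu', ...  % run on the CPU first to rule out a bug
    'MiniBatchSize',16, ...            % lower than the default of 128, to save memory on the GPU later
    'MaxEpochs',5);
net = trainNetwork(imds,lgraph,options);  % imds: your training datastore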
Joss Knight
on 4 Aug 2020
It looks as though you've replaced every relu layer with multiple other layers, which makes your network deeper. The deeper the network, the more memory you need for training: that is how backpropagation works; it has to hold onto the activations from every layer. In addition, we can only guess at the memory requirements of your extra layers, since you don't say what they are.
I wonder what your extra layers are and why you need more than one new layer to replace something as simple as a relu activation.
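For example, you can compare the layer counts and inspect the per-layer activation sizes; lgraphOrig here stands for a copy of the graph taken before the substitution:
% More layers means more stored activations during backpropagation,
% and therefore more GPU memory.
fprintf('Original: %d layers, modified: %d layers\n', ...
    numel(lgraphOrig.Layers), numel(lgraph.Layers))
analyzeNetwork(lgraph)  % shows the activation size of every layer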
2 comments
Joss Knight
on 9 Aug 2020
Did you delete the first network before training the second network? Try calling reset(gpuDevice) before training the modified network.
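For example (imds and options stand for your own data and training options):
g = gpuDevice;  % handle to the current GPU
fprintf('Available before reset: %.2f GB\n', g.AvailableMemory/1e9)
reset(g)        % discards all gpuArray data and cached memory from the first run
net = trainNetwork(imds,lgraph,options);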