How to avoid NaN in the mini-batch loss when training a convolutional neural network?

76 views (last 30 days)
Hello,
I'm working on training a convolutional neural network following the example from https://de.mathworks.com/help/nnet/examples/create-simple-deep-learning-network-for-classification.html. I have 2000 images in each of 8 label categories and use 90% for training and 10% for testing. The images are in .jpg format and have a size of 512x512x1. The architecture of the CNN is currently as follows:
layers = [imageInputLayer([512 512 1])
    convolution2dLayer(5,15)
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    fullyConnectedLayer(8)
    softmaxLayer
    classificationLayer()];
options = trainingOptions('sgdm','MaxEpochs',15,'InitialLearnRate',0.001,'ExecutionEnvironment','parallel');
After the first training epoch the mini-batch loss becomes NaN and the accuracy stays around chance level. The reason is probably that backpropagation produces NaN weights.
How can I avoid this problem? Thanks for the answers!
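For reference, a quick way to check whether the learned weights really contain NaN (a minimal sketch; trainingImages is just a placeholder name for my image datastore):
net = trainNetwork(trainingImages, layers, options);
convWeights = net.Layers(2).Weights;    % weights of the convolution2dLayer
any(isnan(convWeights(:)))              % true if backpropagation produced NaN weights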
  9 comments
Javier Pinzón on 6 Sep 2017
Hello Alexander,
I'm happy that you were able to solve your problem. Feel free to ask any further questions.
Greg Heath on 7 Sep 2017
Comment by Ashok kumar on 6 Jun 2017
MOVED FROM AN ACCEPTED ANSWER BOX
What is the mini-batch loss in the table in the command window, and how is it calculated?


Accepted Answer

Javier Pinzón on 8 Sep 2017
I will collect the most helpful comments into an answer that can help to solve this problem of NaN accuracy:
Hello everybody,
Because I have experienced some issues with PNG images, I highly recommend using the JPG/JPEG format. Sometimes, because of the extra layers a PNG image can contain, only the last layer is read and the whole image takes on that layer's color, i.e., the image becomes entirely black or red. When you send such images to the network, it only sees a single flat color, with nothing that relates to the rest of the images, so the network will not be able to learn the features. Also be careful with the size of your filters. Johannes' answer might also be a solution in some cases.
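One way to guard against this (a minimal sketch; the folder name and the helper toSingleChannel are placeholders, not part of the original post) is to force every image to a single grayscale channel when it is read from the datastore:
imds = imageDatastore('trainingImages','IncludeSubfolders',true,'LabelSource','foldernames');
imds.ReadFcn = @toSingleChannel;

function img = toSingleChannel(file)
    % Read the file and keep exactly one grayscale channel
    img = imread(file);
    if size(img,3) > 1
        img = rgb2gray(img(:,:,1:3));   % collapse RGB (ignoring any extra layers) to one channel
    end
end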
___________________
Be careful with the size of your input image... When it is really big, as in Alexander's case, it will be very difficult for the network to learn with only one convolution, because there is only a single set of weights for the huge number of features the network has to learn. I would recommend using at least 2 or 3 convolutions for that size, even for a size of 128x128, and using pooling layers to reduce the size of what enters the fully connected layer, because that helps it classify the extracted features.
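A sketch of that kind of deeper stack (the filter counts below are only illustrative, not values taken from this thread):
layers = [imageInputLayer([512 512 1])
    convolution2dLayer(5,16,'Padding',2)
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    convolution2dLayer(5,32,'Padding',2)
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    convolution2dLayer(3,64,'Padding',1)
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    fullyConnectedLayer(8)
    softmaxLayer
    classificationLayer()];
Each pooling layer halves the spatial size, so far fewer activations reach the fully connected layer.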
___________________
To initialize the weights, you need to define the convolution layer before the Layer struct:
conv1 = convolution2dLayer(F,D,'Padding',0,...
'BiasLearnRateFactor',2,...
'name','conv1');
conv1.Weights = gpuArray(single(randn([F F 3 D])*0.0001));
conv1.Bias = gpuArray(single(randn([1 1 D])*0.00001+1));
You can initialize the weights and the bias if needed. Remember, D is the number of filters and F is the filter size. Then, call your variable in the layer struct:
layers = [ ...
imageInputLayer([128 128 3]);
conv1;
and that is all.
Hope it helps,
Javier

More Answers (4)

Khalid Babutain on 18 Oct 2019
I came across this thread because I had the same issue, and I was able to solve it just by lowering the initial learning rate from ('InitialLearnRate',1e-3) to ('InitialLearnRate',1e-5).
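A minimal sketch of that change (the other options mirror the original question):
options = trainingOptions('sgdm', ...
    'MaxEpochs',15, ...
    'InitialLearnRate',1e-5, ...            % lowered from 1e-3
    'ExecutionEnvironment','parallel');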
  3 comments
Kelvin Owusu on 8 Jan 2024
Yes, this works and helped more than expected. I was getting 29% accuracy while changing parameters, but after resetting them and reducing the learning rate I am getting 100% 👏



Salma Hassan on 20 Dec 2017
I have this line in my code: [trainedNet,traininfo] = trainNetwork(trainingimages,Layers,opts); When I open the structure traininfo I get the values for training accuracy and training loss, but for the validation accuracy and loss I only get the first value and the rest is NaN. What is the problem in this case?
  2 comments
Javier Pinzón on 21 Dec 2017
Hello Salma,
Could you please open a new thread with this question, and provide screenshots of your code and results if possible, so we can see what may be wrong or what is causing the problem?
Thanks
Salma Hassan on 31 Dec 2017
Mr Javier Pinzón, I posted a separate question titled "what is causes NaN values in the validation accuracy and loss from traning convolutional neural network and how to avoid it?" at this link: https://www.mathworks.com/matlabcentral/answers/375090-what-is-causes-nan-values-in-the-validation-accuracy-and-loss-from-traning-convolutional-neural-n



Poorya Khanali on 10 Feb 2021
I have a ResNet. When the image size is 35x60 everything works fine (no NaN during training), but when I change the image size to 59x60 (for different data) the network seems to work at the beginning; after some epochs, however, NaN values start to appear. Could you please help me out?

Matt J on 5 Dec 2023
I found that changing the solver (from "sgdm" to "adam") resolved the problem.
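A minimal sketch of that swap (the remaining options mirror the original question, not Matt J's setup):
options = trainingOptions('adam', ...
    'MaxEpochs',15, ...
    'InitialLearnRate',0.001, ...
    'ExecutionEnvironment','parallel');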
