Getting Jumps in mini-batch loss when training YoloV2

Question

ohad a on 2 May 2019

https://es.mathworks.com/matlabcentral/answers/459860-getting-jumps-in-mini-batch-loss-when-training-yolov2

Answered: Zahra Moayed on 5 Aug 2019

trainingLoss.png

Hello.

I'm trying to train YOLO v2 on my person detector data set.

For some reason I get large jumps in the training loss in the middle of training. I can also see that the temporary checkpoint model files shrink dramatically in size (e.g., from 59 MB to 1.5 MB).
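To see at which checkpoint the file size drops, the saved files can be listed in order of creation. This is only a minimal sketch: it assumes the checkpoints are MAT-files in the folder passed as 'CheckpointPath' (tempdir in the code below), and the '*.mat' pattern is a guess that may also match unrelated files in that folder.

```matlab
% Folder used as 'CheckpointPath' in the training options (assumption).
checkpointDir = tempdir;

% List every MAT-file in that folder; the pattern is hypothetical and may
% also match files that are not detector checkpoints.
files = dir(fullfile(checkpointDir, '*.mat'));

% Sort the listing from oldest to newest.
[~, order] = sort([files.datenum]);
files = files(order);

% Print each file name with its size in megabytes.
for k = 1:numel(files)
    fprintf('%s  %8.2f MB\n', files(k).name, files(k).bytes / 2^20);
end
```

Comparing the sizes with the iteration at which the loss jumps may show whether both happen at the same point in training.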

I'm using about 170 pictures with 1 to 6 bounding boxes each.

Here is the code:

% Define the image input size.
imageSize = [450 800 3];
% Define the number of object classes to detect.
numClasses = width(personDataSet)-1;
anchorBoxes = [
    76  43
    208 147
    103  68
    158 106
    198 137
    129 81
    73 40
];
% Load the pretrained base network (semicolon suppresses console output).
baseNetwork = resnet50;
% Specify the feature extraction layer.
featureLayer = 'activation_49_relu';
analyzeNetwork(baseNetwork);
%reorgLayer = 'activation_47_relu';
% Create the YOLO v2 object detection network. 
% lgraph = yolov2Layers(imageSize,numClasses,anchorBoxes,baseNetwork,featureLayer,'ReorglayerSource',reorgLayer);
lgraph = yolov2Layers(imageSize,numClasses,anchorBoxes,baseNetwork,featureLayer);
 
% Configure the training options.
%  * Lower the learning rate to 1e-3 to stabilize training.
%  * Set CheckpointPath to save detector checkpoints to a temporary
%    location. If training is interrupted due to a system failure or
%    power outage, you can resume training from the saved checkpoint.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 34, ...
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs', 30, ...
    'VerboseFrequency', 2, ...
    'CheckpointPath', tempdir);

    %'LearnRateSchedule','piecewise', ...
    %'LearnRateDropPeriod',10 , ...
    %'Shuffle','every-epoch');

% Train YOLO v2 detector.
[detector,info] = trainYOLOv2ObjectDetector(trainingData,lgraph,options);

As the code shows, I also tried 'LearnRateSchedule' and 'Shuffle', as well as different learning rates, batch sizes, and numbers of epochs, and I still get the same results.

This is an example of the output from the code above:

Starting parallel pool (parpool) using the 'local' profile ...

Connected to the parallel pool (number of workers: 8).

Training on single CPU.

|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |     RMSE     |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:37 |         8.56 |         73.2 |          0.0010 |
|       1 |           2 |       00:01:14 |         3.55 |         12.6 |          0.0010 |
|       1 |           4 |       00:02:27 |         2.15 |          4.6 |          0.0010 |
|       2 |           6 |       00:03:44 |         2.81 |          7.9 |          0.0010 |
|       2 |           8 |       00:04:57 |         2.89 |          8.4 |          0.0010 |
|       2 |          10 |       00:06:10 |         2.91 |          8.5 |          0.0010 |
|       3 |          12 |       00:07:26 |         2.80 |          7.8 |          0.0010 |
|       3 |          14 |       00:08:39 |         2.65 |          7.0 |          0.0010 |
|       4 |          16 |       00:09:55 |         2.18 |          4.7 |          0.0010 |
|       4 |          18 |       00:11:08 |         2.23 |          5.0 |          0.0010 |
|       4 |          20 |       00:12:21 |         2.32 |          5.4 |          0.0010 |
|       5 |          22 |       00:13:37 |         2.40 |          5.8 |          0.0010 |
|       5 |          24 |       00:14:50 |         2.42 |          5.9 |          0.0010 |
|       6 |          26 |       00:16:06 |         2.53 |          6.4 |          0.0010 |
|       6 |          28 |       00:17:18 |         2.59 |          6.7 |          0.0010 |
|       6 |          30 |       00:18:31 |         2.37 |          5.6 |          0.0010 |
|       7 |          32 |       00:19:47 |         2.29 |          5.2 |          0.0010 |
|       7 |          34 |       00:20:59 |         2.34 |          5.5 |          0.0010 |
|       8 |          36 |       00:22:15 |         2.24 |          5.0 |          0.0010 |
|       8 |          38 |       00:23:28 |         2.69 |          7.2 |          0.0010 |
|       8 |          40 |       00:24:41 |         2.86 |          8.2 |          0.0010 |
|       9 |          42 |       00:25:56 |         1.63 |          2.7 |          0.0010 |
|       9 |          44 |       00:27:09 |         1.71 |          2.9 |          0.0010 |
|      10 |          46 |       00:28:25 |         1.65 |          2.7 |          0.0010 |
|      10 |          48 |       00:29:37 |         1.68 |          2.8 |          0.0010 |
|      10 |          50 |       00:30:50 |         1.65 |          2.7 |          0.0010 |
|      11 |          52 |       00:32:07 |         1.68 |          2.8 |          0.0010 |
|      11 |          54 |       00:33:20 |         1.71 |          2.9 |          0.0010 |
|      12 |          56 |       00:34:35 |         1.65 |          2.7 |          0.0010 |
|      12 |          58 |       00:35:47 |         1.63 |          2.7 |          0.0010 |
|      12 |          60 |       00:36:58 |         1.62 |          2.6 |          0.0010 |
|      13 |          62 |       00:38:13 |         1.70 |          2.9 |          0.0010 |
|      13 |          64 |       00:39:25 |         1.79 |          3.2 |          0.0010 |
|      14 |          66 |       00:40:40 |         1.66 |          2.8 |          0.0010 |
|      14 |          68 |       00:41:52 |         1.66 |          2.7 |          0.0010 |
|      14 |          70 |       00:43:04 |         2.08 |          4.3 |          0.0010 |
|      15 |          72 |       00:44:19 |         4.30 |         18.5 |          0.0010 |
|      15 |          74 |       00:45:30 |         9.76 |         95.2 |          0.0010 |
|      16 |          76 |       00:46:42 |         9.08 |         82.5 |          0.0010 |
|      16 |          78 |       00:47:54 |         8.59 |         73.8 |          0.0010 |
|      16 |          80 |       00:49:05 |         8.25 |         68.1 |          0.0010 |
|      17 |          82 |       00:50:17 |         8.10 |         65.6 |          0.0010 |
|      17 |          84 |       00:51:30 |         7.86 |         61.7 |          0.0010 |
|      18 |          86 |       00:52:41 |         7.09 |         50.2 |          0.0010 |
|      18 |          88 |       00:53:52 |         6.51 |         42.3 |          0.0010 |
|      18 |          90 |       00:55:04 |         6.66 |         44.4 |          0.0010 |
|      19 |          92 |       00:56:16 |         6.70 |         45.0 |          0.0010 |
|      19 |          94 |       00:57:27 |         6.65 |         44.2 |          0.0010 |
|      20 |          96 |       00:58:39 |         6.18 |         38.3 |          0.0010 |
|      20 |          98 |       00:59:50 |         5.88 |         34.6 |          0.0010 |
|      20 |         100 |       01:01:01 |         6.15 |         37.8 |          0.0010 |
|      21 |         102 |       01:02:13 |         5.88 |         34.5 |          0.0010 |
|      21 |         104 |       01:03:25 |         6.09 |         37.0 |          0.0010 |
|      22 |         106 |       01:04:37 |         6.14 |         37.7 |          0.0010 |
|      22 |         108 |       01:05:48 |         5.12 |         26.2 |          0.0010 |
|      22 |         110 |       01:06:59 |         5.99 |         35.9 |          0.0010 |
|      23 |         112 |       01:08:10 |         5.95 |         35.4 |          0.0010 |
|      23 |         114 |       01:09:21 |         6.21 |         38.6 |          0.0010 |
|      24 |         116 |       01:10:33 |         6.07 |         36.9 |          0.0010 |
|      24 |         118 |       01:11:44 |         5.80 |         33.7 |          0.0010 |
|      24 |         120 |       01:12:55 |         6.30 |         39.7 |          0.0010 |
|      25 |         122 |       01:14:07 |         5.90 |         34.9 |          0.0010 |
|      25 |         124 |       01:15:18 |         6.17 |         38.0 |          0.0010 |
|      26 |         126 |       01:16:31 |         5.85 |         34.2 |          0.0010 |
|      26 |         128 |       01:17:42 |         5.53 |         30.6 |          0.0010 |
|      26 |         130 |       01:18:53 |         5.91 |         35.0 |          0.0010 |
|      27 |         132 |       01:20:05 |         5.88 |         34.6 |          0.0010 |
|      27 |         134 |       01:21:16 |         6.14 |         37.8 |          0.0010 |
|      28 |         136 |       01:22:28 |         6.03 |         36.4 |          0.0010 |
|      28 |         138 |       01:23:40 |         5.26 |         27.6 |          0.0010 |
|      28 |         140 |       01:24:53 |         5.90 |         34.8 |          0.0010 |
|      29 |         142 |       01:26:04 |         5.86 |         34.3 |          0.0010 |
|      29 |         144 |       01:27:16 |         6.14 |         37.7 |          0.0010 |
|      30 |         146 |       01:28:28 |         5.60 |         31.3 |          0.0010 |
|      30 |         148 |       01:29:40 |         5.76 |         33.2 |          0.0010 |
|      30 |         150 |       01:30:52 |         5.89 |         34.7 |          0.0010 |
|========================================================================================|
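
The point where the loss takes off can also be located programmatically. The sketch below uses the mini-batch loss values logged above for iterations 66 to 76 (the second-to-last column, which is roughly the square of the column before it). The threshold jumpFactor is a hypothetical choice for illustration, not a recommended value.

```matlab
% Iterations and mini-batch loss copied from the log above.
iterations = [66 68 70 72 74 76];
miniBatchLoss = [2.8 2.7 4.3 18.5 95.2 82.5];

% Hypothetical threshold: flag a loss more than 3x the lowest loss so far.
jumpFactor = 3;

% Running minimum of the loss up to each logged iteration.
lowestSoFar = cummin(miniBatchLoss);

% First logged iteration whose loss exceeds the threshold.
jumpIndex = find(miniBatchLoss > jumpFactor * lowestSoFar, 1, 'first');
firstJumpIteration = iterations(jumpIndex);

fprintf('Loss first jumps at iteration %d (loss %.1f).\n', ...
    firstJumpIteration, miniBatchLoss(jumpIndex));
```

For a full run, the per-iteration values in info.TrainingLoss, returned by trainYOLOv2ObjectDetector, could presumably be used in place of the hand-copied vector.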

Answer 1

ping.jiang on 13 Jun 2019

https://es.mathworks.com/matlabcentral/answers/459860-getting-jumps-in-mini-batch-loss-when-training-yolov2#answer_378981

So, what is your question?

Answer 2

Zahra Moayed on 5 Aug 2019

https://es.mathworks.com/matlabcentral/answers/459860-getting-jumps-in-mini-batch-loss-when-training-yolov2#answer_386158

I had the same issue, but when I switched to [224 224 3], which is the input size of ResNet, and resized the anchor boxes accordingly, it finally worked. However, it only worked with a single class.

I also used a MiniBatchSize of 16 and Shuffle set to 'every-epoch', but the main change was the input size.
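
As a rough illustration of the change described above, the anchor boxes from the question can be rescaled from the original [450 800 3] input size to [224 224 3]. This is a minimal sketch, not the exact code used: it assumes the anchors are given as [height width], as documented for yolov2Layers, and that simple per-dimension scaling with rounding is acceptable. The learning rate and epoch count are carried over from the question.

```matlab
% Anchor boxes and input size copied from the question ([height width]).
oldImageSize = [450 800 3];
anchorBoxes = [
     76  43
    208 147
    103  68
    158 106
    198 137
    129  81
     73  40
];

% Input size suggested in the answer.
newImageSize = [224 224 3];

% Per-dimension scale factors: rows (height) and columns (width).
scale = newImageSize(1:2) ./ oldImageSize(1:2);

% Scale each anchor, round to whole pixels, and keep every side >= 1.
newAnchorBoxes = max(round(anchorBoxes .* scale), 1);

% Training options mentioned in the answer; the learning rate and epoch
% count are taken from the question (an assumption).
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 16, ...
    'Shuffle', 'every-epoch', ...
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs', 30);
```

The rescaled newAnchorBoxes and newImageSize would then be passed to yolov2Layers in place of the original values. The training images and their ground-truth boxes would presumably need to match the new size as well; that step is not shown here.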
