Main Content

trainSSDObjectDetector

Train an SSD deep learning object detector

Description

Train a Detector


trainedDetector = trainSSDObjectDetector(trainingData,lgraph,options) trains a single shot multibox detector (SSD) using deep learning. You can train an SSD detector to detect multiple object classes.

This function requires that you have Deep Learning Toolbox™. It is recommended that you also have Parallel Computing Toolbox™ to use with a CUDA®-enabled NVIDIA® GPU. For information about the supported compute capabilities, see GPU Support by Release (Parallel Computing Toolbox).
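Before training, you can optionally confirm that a supported GPU is available. This is a minimal check, assuming Parallel Computing Toolbox is installed (canUseGPU requires R2019b or later):

if canUseGPU
    gpuDevice        % display properties of the selected GPU
else
    disp('No supported GPU found. Training runs on the CPU.')
end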

[trainedDetector,info] = trainSSDObjectDetector(___) also returns information on the training progress, such as training loss and accuracy, for each iteration.

Resume Training a Detector

trainedDetector = trainSSDObjectDetector(trainingData,checkpoint,options) resumes training from a detector checkpoint.

Fine-Tune a Detector

trainedDetector = trainSSDObjectDetector(trainingData,detector,options) continues training an SSD multibox object detector with additional fine-tuning options. Use this syntax with additional training data or to perform more training iterations to improve detector accuracy.

Additional Properties

trainedDetector = trainSSDObjectDetector(___,Name,Value) uses additional options specified by one or more name-value pair arguments, in addition to the input arguments in any of the preceding syntaxes.

Examples


Load the training data for vehicle detection into the workspace.

data = load('vehicleTrainingData.mat');
trainingData = data.vehicleTrainingData;

Specify the directory in which the training samples are stored, and add the full path to the file names in the training data.

dataDir = fullfile(toolboxdir('vision'),'visiondata');
trainingData.imageFilename = fullfile(dataDir,trainingData.imageFilename);

Create an image datastore using the files from the table.

imds = imageDatastore(trainingData.imageFilename);

Create a box label datastore using the label columns from the table.

blds = boxLabelDatastore(trainingData(:,2:end));

Combine the datastores.

ds = combine(imds,blds);

Load a preinitialized SSD object detection network.

net = load('ssdVehicleDetector.mat');
lgraph = net.lgraph
lgraph = 
  LayerGraph with properties:

         Layers: [132×1 nnet.cnn.layer.Layer]
    Connections: [141×2 table]
     InputNames: {'input_1'}
    OutputNames: {'focal_loss'  'rcnnboxRegression'}

Inspect the layers in the SSD network and their properties. You can also create the SSD network by following the steps given in Create SSD Object Detection Network.

lgraph.Layers
ans = 
  132×1 Layer array with layers:

     1   'input_1'                           Image Input             224×224×3 images with 'zscore' normalization
     2   'Conv1'                             Convolution             32 3×3×3 convolutions with stride [2  2] and padding 'same'
     3   'bn_Conv1'                          Batch Normalization     Batch normalization with 32 channels
     4   'Conv1_relu'                        Clipped ReLU            Clipped ReLU with ceiling 6
     5   'expanded_conv_depthwise'           Grouped Convolution     32 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
     6   'expanded_conv_depthwise_BN'        Batch Normalization     Batch normalization with 32 channels
     7   'expanded_conv_depthwise_relu'      Clipped ReLU            Clipped ReLU with ceiling 6
     8   'expanded_conv_project'             Convolution             16 1×1×32 convolutions with stride [1  1] and padding 'same'
     9   'expanded_conv_project_BN'          Batch Normalization     Batch normalization with 16 channels
    10   'block_1_expand'                    Convolution             96 1×1×16 convolutions with stride [1  1] and padding 'same'
    11   'block_1_expand_BN'                 Batch Normalization     Batch normalization with 96 channels
    12   'block_1_expand_relu'               Clipped ReLU            Clipped ReLU with ceiling 6
    13   'block_1_depthwise'                 Grouped Convolution     96 groups of 1 3×3×1 convolutions with stride [2  2] and padding 'same'
    14   'block_1_depthwise_BN'              Batch Normalization     Batch normalization with 96 channels
    15   'block_1_depthwise_relu'            Clipped ReLU            Clipped ReLU with ceiling 6
    16   'block_1_project'                   Convolution             24 1×1×96 convolutions with stride [1  1] and padding 'same'
    17   'block_1_project_BN'                Batch Normalization     Batch normalization with 24 channels
    18   'block_2_expand'                    Convolution             144 1×1×24 convolutions with stride [1  1] and padding 'same'
    19   'block_2_expand_BN'                 Batch Normalization     Batch normalization with 144 channels
    20   'block_2_expand_relu'               Clipped ReLU            Clipped ReLU with ceiling 6
    21   'block_2_depthwise'                 Grouped Convolution     144 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
    22   'block_2_depthwise_BN'              Batch Normalization     Batch normalization with 144 channels
    23   'block_2_depthwise_relu'            Clipped ReLU            Clipped ReLU with ceiling 6
    24   'block_2_project'                   Convolution             24 1×1×144 convolutions with stride [1  1] and padding 'same'
    25   'block_2_project_BN'                Batch Normalization     Batch normalization with 24 channels
    26   'block_2_add'                       Addition                Element-wise addition of 2 inputs
    27   'block_3_expand'                    Convolution             144 1×1×24 convolutions with stride [1  1] and padding 'same'
    28   'block_3_expand_BN'                 Batch Normalization     Batch normalization with 144 channels
    29   'block_3_expand_relu'               Clipped ReLU            Clipped ReLU with ceiling 6
    30   'block_3_depthwise'                 Grouped Convolution     144 groups of 1 3×3×1 convolutions with stride [2  2] and padding 'same'
    31   'block_3_depthwise_BN'              Batch Normalization     Batch normalization with 144 channels
    32   'block_3_depthwise_relu'            Clipped ReLU            Clipped ReLU with ceiling 6
    33   'block_3_project'                   Convolution             32 1×1×144 convolutions with stride [1  1] and padding 'same'
    34   'block_3_project_BN'                Batch Normalization     Batch normalization with 32 channels
    35   'block_4_expand'                    Convolution             192 1×1×32 convolutions with stride [1  1] and padding 'same'
    36   'block_4_expand_BN'                 Batch Normalization     Batch normalization with 192 channels
    37   'block_4_expand_relu'               Clipped ReLU            Clipped ReLU with ceiling 6
    38   'block_4_depthwise'                 Grouped Convolution     192 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
    39   'block_4_depthwise_BN'              Batch Normalization     Batch normalization with 192 channels
    40   'block_4_depthwise_relu'            Clipped ReLU            Clipped ReLU with ceiling 6
    41   'block_4_project'                   Convolution             32 1×1×192 convolutions with stride [1  1] and padding 'same'
    42   'block_4_project_BN'                Batch Normalization     Batch normalization with 32 channels
    43   'block_4_add'                       Addition                Element-wise addition of 2 inputs
    44   'block_5_expand'                    Convolution             192 1×1×32 convolutions with stride [1  1] and padding 'same'
    45   'block_5_expand_BN'                 Batch Normalization     Batch normalization with 192 channels
    46   'block_5_expand_relu'               Clipped ReLU            Clipped ReLU with ceiling 6
    47   'block_5_depthwise'                 Grouped Convolution     192 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
    48   'block_5_depthwise_BN'              Batch Normalization     Batch normalization with 192 channels
    49   'block_5_depthwise_relu'            Clipped ReLU            Clipped ReLU with ceiling 6
    50   'block_5_project'                   Convolution             32 1×1×192 convolutions with stride [1  1] and padding 'same'
    51   'block_5_project_BN'                Batch Normalization     Batch normalization with 32 channels
    52   'block_5_add'                       Addition                Element-wise addition of 2 inputs
    53   'block_6_expand'                    Convolution             192 1×1×32 convolutions with stride [1  1] and padding 'same'
    54   'block_6_expand_BN'                 Batch Normalization     Batch normalization with 192 channels
    55   'block_6_expand_relu'               Clipped ReLU            Clipped ReLU with ceiling 6
    56   'block_6_depthwise'                 Grouped Convolution     192 groups of 1 3×3×1 convolutions with stride [2  2] and padding 'same'
    57   'block_6_depthwise_BN'              Batch Normalization     Batch normalization with 192 channels
    58   'block_6_depthwise_relu'            Clipped ReLU            Clipped ReLU with ceiling 6
    59   'block_6_project'                   Convolution             64 1×1×192 convolutions with stride [1  1] and padding 'same'
    60   'block_6_project_BN'                Batch Normalization     Batch normalization with 64 channels
    61   'block_7_expand'                    Convolution             384 1×1×64 convolutions with stride [1  1] and padding 'same'
    62   'block_7_expand_BN'                 Batch Normalization     Batch normalization with 384 channels
    63   'block_7_expand_relu'               Clipped ReLU            Clipped ReLU with ceiling 6
    64   'block_7_depthwise'                 Grouped Convolution     384 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
    65   'block_7_depthwise_BN'              Batch Normalization     Batch normalization with 384 channels
    66   'block_7_depthwise_relu'            Clipped ReLU            Clipped ReLU with ceiling 6
    67   'block_7_project'                   Convolution             64 1×1×384 convolutions with stride [1  1] and padding 'same'
    68   'block_7_project_BN'                Batch Normalization     Batch normalization with 64 channels
    69   'block_7_add'                       Addition                Element-wise addition of 2 inputs
    70   'block_8_expand'                    Convolution             384 1×1×64 convolutions with stride [1  1] and padding 'same'
    71   'block_8_expand_BN'                 Batch Normalization     Batch normalization with 384 channels
    72   'block_8_expand_relu'               Clipped ReLU            Clipped ReLU with ceiling 6
    73   'block_8_depthwise'                 Grouped Convolution     384 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
    74   'block_8_depthwise_BN'              Batch Normalization     Batch normalization with 384 channels
    75   'block_8_depthwise_relu'            Clipped ReLU            Clipped ReLU with ceiling 6
    76   'block_8_project'                   Convolution             64 1×1×384 convolutions with stride [1  1] and padding 'same'
    77   'block_8_project_BN'                Batch Normalization     Batch normalization with 64 channels
    78   'block_8_add'                       Addition                Element-wise addition of 2 inputs
    79   'block_9_expand'                    Convolution             384 1×1×64 convolutions with stride [1  1] and padding 'same'
    80   'block_9_expand_BN'                 Batch Normalization     Batch normalization with 384 channels
    81   'block_9_expand_relu'               Clipped ReLU            Clipped ReLU with ceiling 6
    82   'block_9_depthwise'                 Grouped Convolution     384 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
    83   'block_9_depthwise_BN'              Batch Normalization     Batch normalization with 384 channels
    84   'block_9_depthwise_relu'            Clipped ReLU            Clipped ReLU with ceiling 6
    85   'block_9_project'                   Convolution             64 1×1×384 convolutions with stride [1  1] and padding 'same'
    86   'block_9_project_BN'                Batch Normalization     Batch normalization with 64 channels
    87   'block_9_add'                       Addition                Element-wise addition of 2 inputs
    88   'block_10_expand'                   Convolution             384 1×1×64 convolutions with stride [1  1] and padding 'same'
    89   'block_10_expand_BN'                Batch Normalization     Batch normalization with 384 channels
    90   'block_10_expand_relu'              Clipped ReLU            Clipped ReLU with ceiling 6
    91   'block_10_depthwise'                Grouped Convolution     384 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
    92   'block_10_depthwise_BN'             Batch Normalization     Batch normalization with 384 channels
    93   'block_10_depthwise_relu'           Clipped ReLU            Clipped ReLU with ceiling 6
    94   'block_10_project'                  Convolution             96 1×1×384 convolutions with stride [1  1] and padding 'same'
    95   'block_10_project_BN'               Batch Normalization     Batch normalization with 96 channels
    96   'block_11_expand'                   Convolution             576 1×1×96 convolutions with stride [1  1] and padding 'same'
    97   'block_11_expand_BN'                Batch Normalization     Batch normalization with 576 channels
    98   'block_11_expand_relu'              Clipped ReLU            Clipped ReLU with ceiling 6
    99   'block_11_depthwise'                Grouped Convolution     576 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
    100   'block_11_depthwise_BN'             Batch Normalization     Batch normalization with 576 channels
    101   'block_11_depthwise_relu'           Clipped ReLU            Clipped ReLU with ceiling 6
    102   'block_11_project'                  Convolution             96 1×1×576 convolutions with stride [1  1] and padding 'same'
    103   'block_11_project_BN'               Batch Normalization     Batch normalization with 96 channels
    104   'block_11_add'                      Addition                Element-wise addition of 2 inputs
    105   'block_12_expand'                   Convolution             576 1×1×96 convolutions with stride [1  1] and padding 'same'
    106   'block_12_expand_BN'                Batch Normalization     Batch normalization with 576 channels
    107   'block_12_expand_relu'              Clipped ReLU            Clipped ReLU with ceiling 6
    108   'block_12_depthwise'                Grouped Convolution     576 groups of 1 3×3×1 convolutions with stride [1  1] and padding 'same'
    109   'block_12_depthwise_BN'             Batch Normalization     Batch normalization with 576 channels
    110   'block_12_depthwise_relu'           Clipped ReLU            Clipped ReLU with ceiling 6
    111   'block_12_project'                  Convolution             96 1×1×576 convolutions with stride [1  1] and padding 'same'
    112   'block_12_project_BN'               Batch Normalization     Batch normalization with 96 channels
    113   'block_12_add'                      Addition                Element-wise addition of 2 inputs
    114   'block_13_expand'                   Convolution             576 1×1×96 convolutions with stride [1  1] and padding 'same'
    115   'block_13_expand_BN'                Batch Normalization     Batch normalization with 576 channels
    116   'block_13_expand_relu'              Clipped ReLU            Clipped ReLU with ceiling 6
    117   'block_13_depthwise'                Grouped Convolution     576 groups of 1 3×3×1 convolutions with stride [2  2] and padding 'same'
    118   'block_13_depthwise_BN'             Batch Normalization     Batch normalization with 576 channels
    119   'block_13_depthwise_relu'           Clipped ReLU            Clipped ReLU with ceiling 6
    120   'block_13_project'                  Convolution             160 1×1×576 convolutions with stride [1  1] and padding 'same'
    121   'block_13_project_BN'               Batch Normalization     Batch normalization with 160 channels
    122   'block_13_project_BN_anchorbox1'    Anchor Box Layer.       Anchor Box Layer.
    123   'block_13_project_BN_mbox_conf_1'   Convolution             10 3×3 convolutions with stride [1  1] and padding [1  1  1  1]
    124   'block_13_project_BN_mbox_loc_1'    Convolution             20 3×3 convolutions with stride [1  1] and padding [1  1  1  1]
    125   'block_10_project_BN_anchorbox2'    Anchor Box Layer.       Anchor Box Layer.
    126   'block_10_project_BN_mbox_conf_1'   Convolution             10 3×3 convolutions with stride [1  1] and padding [1  1  1  1]
    127   'block_10_project_BN_mbox_loc_1'    Convolution             20 3×3 convolutions with stride [1  1] and padding [1  1  1  1]
    128   'confmerge'                         SSD Merge Layer.        SSD Merge Layer.
    129   'locmerge'                          SSD Merge Layer.        SSD Merge Layer.
    130   'anchorBoxSoft'                     Softmax                 softmax
    131   'focal_loss'                        Focal Loss Layer.       Focal Loss Layer.
    132   'rcnnboxRegression'                 Box Regression Output   smooth-l1 loss

Configure the network training options.

options = trainingOptions('sgdm',...
          'InitialLearnRate',5e-5,...
          'MiniBatchSize',16,...
          'Verbose',true,...
          'MaxEpochs',50,...
          'Shuffle','every-epoch',...
          'VerboseFrequency',10,...
          'CheckpointPath',tempdir);

Train the SSD network.

[detector,info] = trainSSDObjectDetector(ds,lgraph,options);
*************************************************************************
Training an SSD Object Detector for the following object classes:

* vehicle

Training on single CPU.
Initializing input data normalization.
|=======================================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |     Loss     |   Accuracy   |     RMSE     |      Rate       |
|=======================================================================================================|
|       1 |           1 |       00:00:18 |       0.8757 |       48.51% |         1.47 |      5.0000e-05 |
|       1 |          10 |       00:01:43 |       0.8386 |       48.35% |         1.43 |      5.0000e-05 |
|       2 |          20 |       00:03:15 |       0.7860 |       48.87% |         1.37 |      5.0000e-05 |
|       2 |          30 |       00:04:42 |       0.6771 |       48.65% |         1.23 |      5.0000e-05 |
|       3 |          40 |       00:06:46 |       0.7129 |       48.43% |         1.28 |      5.0000e-05 |
|       3 |          50 |       00:08:37 |       0.5723 |       49.04% |         1.09 |      5.0000e-05 |
|       4 |          60 |       00:10:12 |       0.5632 |       48.72% |         1.08 |      5.0000e-05 |
|       4 |          70 |       00:11:25 |       0.5438 |       49.11% |         1.06 |      5.0000e-05 |
|       5 |          80 |       00:12:35 |       0.5277 |       48.48% |         1.03 |      5.0000e-05 |
|       5 |          90 |       00:13:39 |       0.4711 |       48.95% |         0.96 |      5.0000e-05 |
|       6 |         100 |       00:14:50 |       0.5063 |       48.72% |         1.00 |      5.0000e-05 |
|       7 |         110 |       00:16:11 |       0.4812 |       48.99% |         0.97 |      5.0000e-05 |
|       7 |         120 |       00:17:27 |       0.5248 |       48.53% |         1.04 |      5.0000e-05 |
|       8 |         130 |       00:18:33 |       0.4245 |       49.32% |         0.90 |      5.0000e-05 |
|       8 |         140 |       00:19:37 |       0.4889 |       48.87% |         0.98 |      5.0000e-05 |
|       9 |         150 |       00:20:47 |       0.4213 |       49.18% |         0.89 |      5.0000e-05 |
|       9 |         160 |       00:22:02 |       0.4753 |       49.45% |         0.97 |      5.0000e-05 |
|      10 |         170 |       00:23:13 |       0.4454 |       49.31% |         0.92 |      5.0000e-05 |
|      10 |         180 |       00:24:22 |       0.4378 |       49.26% |         0.92 |      5.0000e-05 |
|      11 |         190 |       00:25:29 |       0.4278 |       49.13% |         0.90 |      5.0000e-05 |
|      12 |         200 |       00:26:39 |       0.4494 |       49.77% |         0.93 |      5.0000e-05 |
|      12 |         210 |       00:27:45 |       0.4298 |       49.03% |         0.90 |      5.0000e-05 |
|      13 |         220 |       00:28:47 |       0.4296 |       49.86% |         0.90 |      5.0000e-05 |
|      13 |         230 |       00:30:05 |       0.3987 |       49.65% |         0.86 |      5.0000e-05 |
|      14 |         240 |       00:31:13 |       0.4042 |       49.46% |         0.87 |      5.0000e-05 |
|      14 |         250 |       00:32:20 |       0.4244 |       50.16% |         0.90 |      5.0000e-05 |
|      15 |         260 |       00:33:31 |       0.4374 |       49.72% |         0.93 |      5.0000e-05 |
|      15 |         270 |       00:34:38 |       0.4016 |       48.95% |         0.86 |      5.0000e-05 |
|      16 |         280 |       00:35:47 |       0.4289 |       49.44% |         0.91 |      5.0000e-05 |
|      17 |         290 |       00:36:58 |       0.3866 |       49.10% |         0.84 |      5.0000e-05 |
|      17 |         300 |       00:38:10 |       0.4077 |       49.59% |         0.87 |      5.0000e-05 |
|      18 |         310 |       00:39:24 |       0.3943 |       49.74% |         0.86 |      5.0000e-05 |
|      18 |         320 |       00:40:48 |       0.4206 |       49.99% |         0.89 |      5.0000e-05 |
|      19 |         330 |       00:41:53 |       0.4504 |       49.72% |         0.94 |      5.0000e-05 |
|      19 |         340 |       00:42:55 |       0.3449 |       50.38% |         0.78 |      5.0000e-05 |
|      20 |         350 |       00:44:01 |       0.3450 |       49.57% |         0.77 |      5.0000e-05 |
|      20 |         360 |       00:44:59 |       0.3769 |       50.24% |         0.83 |      5.0000e-05 |
|      21 |         370 |       00:46:05 |       0.3336 |       50.40% |         0.76 |      5.0000e-05 |
|      22 |         380 |       00:47:01 |       0.3453 |       49.27% |         0.78 |      5.0000e-05 |
|      22 |         390 |       00:48:04 |       0.4011 |       49.72% |         0.87 |      5.0000e-05 |
|      23 |         400 |       00:49:06 |       0.3307 |       50.32% |         0.75 |      5.0000e-05 |
|      23 |         410 |       00:50:03 |       0.3186 |       50.01% |         0.73 |      5.0000e-05 |
|      24 |         420 |       00:51:10 |       0.3491 |       50.43% |         0.78 |      5.0000e-05 |
|      24 |         430 |       00:52:17 |       0.3299 |       50.31% |         0.76 |      5.0000e-05 |
|      25 |         440 |       00:53:35 |       0.3326 |       50.78% |         0.76 |      5.0000e-05 |
|      25 |         450 |       00:54:42 |       0.3219 |       50.61% |         0.75 |      5.0000e-05 |
|      26 |         460 |       00:55:55 |       0.3090 |       50.59% |         0.71 |      5.0000e-05 |
|      27 |         470 |       00:57:08 |       0.3036 |       51.48% |         0.71 |      5.0000e-05 |
|      27 |         480 |       00:58:16 |       0.3359 |       50.43% |         0.76 |      5.0000e-05 |
|      28 |         490 |       00:59:24 |       0.3182 |       50.35% |         0.73 |      5.0000e-05 |
|      28 |         500 |       01:00:36 |       0.3265 |       50.71% |         0.76 |      5.0000e-05 |
|      29 |         510 |       01:01:44 |       0.3415 |       50.53% |         0.78 |      5.0000e-05 |
|      29 |         520 |       01:02:51 |       0.3126 |       51.15% |         0.73 |      5.0000e-05 |
|      30 |         530 |       01:03:59 |       0.3179 |       50.74% |         0.75 |      5.0000e-05 |
|      30 |         540 |       01:05:15 |       0.3032 |       50.83% |         0.72 |      5.0000e-05 |
|      31 |         550 |       01:06:25 |       0.2868 |       50.69% |         0.68 |      5.0000e-05 |
|      32 |         560 |       01:07:42 |       0.2716 |       50.85% |         0.66 |      5.0000e-05 |
|      32 |         570 |       01:08:53 |       0.3016 |       51.32% |         0.71 |      5.0000e-05 |
|      33 |         580 |       01:10:05 |       0.2624 |       51.35% |         0.63 |      5.0000e-05 |
|      33 |         590 |       01:11:12 |       0.3145 |       51.38% |         0.73 |      5.0000e-05 |
|      34 |         600 |       01:12:31 |       0.2949 |       51.28% |         0.70 |      5.0000e-05 |
|      34 |         610 |       01:13:46 |       0.3070 |       51.22% |         0.73 |      5.0000e-05 |
|      35 |         620 |       01:15:01 |       0.3119 |       51.49% |         0.73 |      5.0000e-05 |
|      35 |         630 |       01:16:14 |       0.2869 |       51.81% |         0.70 |      5.0000e-05 |
|      36 |         640 |       01:17:28 |       0.3401 |       51.28% |         0.78 |      5.0000e-05 |
|      37 |         650 |       01:18:40 |       0.3123 |       51.43% |         0.73 |      5.0000e-05 |
|      37 |         660 |       01:19:58 |       0.2954 |       51.27% |         0.71 |      5.0000e-05 |
|      38 |         670 |       01:21:12 |       0.2792 |       52.17% |         0.68 |      5.0000e-05 |
|      38 |         680 |       01:22:29 |       0.3225 |       51.36% |         0.76 |      5.0000e-05 |
|      39 |         690 |       01:23:41 |       0.2867 |       52.63% |         0.69 |      5.0000e-05 |
|      39 |         700 |       01:24:56 |       0.3067 |       51.52% |         0.73 |      5.0000e-05 |
|      40 |         710 |       01:26:13 |       0.2718 |       51.84% |         0.66 |      5.0000e-05 |
|      40 |         720 |       01:27:25 |       0.2888 |       52.03% |         0.70 |      5.0000e-05 |
|      41 |         730 |       01:28:42 |       0.2854 |       51.96% |         0.69 |      5.0000e-05 |
|      42 |         740 |       01:29:57 |       0.2744 |       51.18% |         0.67 |      5.0000e-05 |
|      42 |         750 |       01:31:10 |       0.2582 |       51.90% |         0.64 |      5.0000e-05 |
|      43 |         760 |       01:32:25 |       0.2586 |       52.48% |         0.64 |      5.0000e-05 |
|      43 |         770 |       01:33:35 |       0.2632 |       51.47% |         0.65 |      5.0000e-05 |
|      44 |         780 |       01:34:46 |       0.2532 |       51.58% |         0.63 |      5.0000e-05 |
|      44 |         790 |       01:36:07 |       0.2889 |       52.19% |         0.69 |      5.0000e-05 |
|      45 |         800 |       01:37:20 |       0.2551 |       52.35% |         0.63 |      5.0000e-05 |
|      45 |         810 |       01:38:27 |       0.2863 |       51.29% |         0.69 |      5.0000e-05 |
|      46 |         820 |       01:39:43 |       0.2700 |       52.58% |         0.67 |      5.0000e-05 |
|      47 |         830 |       01:40:54 |       0.3234 |       51.96% |         0.76 |      5.0000e-05 |
|      47 |         840 |       01:42:08 |       0.2819 |       52.88% |         0.69 |      5.0000e-05 |
|      48 |         850 |       01:43:23 |       0.2743 |       52.80% |         0.67 |      5.0000e-05 |
|      48 |         860 |       01:44:38 |       0.2365 |       52.21% |         0.60 |      5.0000e-05 |
|      49 |         870 |       01:45:58 |       0.2271 |       52.23% |         0.58 |      5.0000e-05 |
|      49 |         880 |       01:47:21 |       0.3006 |       52.23% |         0.72 |      5.0000e-05 |
|      50 |         890 |       01:48:35 |       0.2494 |       52.32% |         0.63 |      5.0000e-05 |
|      50 |         900 |       01:49:55 |       0.2383 |       53.51% |         0.61 |      5.0000e-05 |
|=======================================================================================================|
Detector training complete.
*************************************************************************

Inspect the properties of the detector.

detector
detector = 
  ssdObjectDetector with properties:

      ModelName: 'vehicle'
        Network: [1×1 DAGNetwork]
     ClassNames: {'vehicle'  'Background'}
    AnchorBoxes: {[5×2 double]  [5×2 double]}

You can monitor the training progress by plotting the training loss for each iteration.

figure
plot(info.TrainingLoss)
grid on
xlabel('Number of Iterations')
ylabel('Training Loss for Each Iteration')

Test the SSD detector on a test image.

img = imread('ssdTestDetect.png');

Run the SSD object detector on the image for vehicle detection.

[bboxes,scores] = detect(detector,img);

Display the detection results.

if(~isempty(bboxes))
    img = insertObjectAnnotation(img,'rectangle',bboxes,scores);
end
figure
imshow(img)

Input Arguments


Labeled ground truth images, specified as a datastore or a table.

  • If you use a datastore, your data must be set up so that calling the datastore with the read and readall functions returns a cell array or table with two or three columns. When the output contains two columns, the first column must contain bounding boxes, and the second column must contain labels, {boxes,labels}. When the output contains three columns, the second column must contain the bounding boxes, and the third column must contain the labels. In this case, the first column can contain any type of data. For example, the first column can contain images or point cloud data.

    data — The first column can contain data, such as point cloud data or images.
    boxes — The second column must be a cell array that contains M-by-4 matrices of bounding boxes of the form [x, y, width, height]. The vectors specify the upper-left corner location and the size of the bounding box for each object in the image.
    labels — The third column must be a cell array that contains M-by-1 categorical vectors of object class names. All categorical data returned by the datastore must contain the same categories.

    For more information, see Datastores for Deep Learning (Deep Learning Toolbox).
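As a quick sanity check, you can read one sample from a combined datastore to confirm it returns the expected {data, boxes, labels} layout. This sketch assumes a combined datastore ds built as in the example above:

sample = read(ds);    % 1-by-3 cell array: {image, boxes, labels}
I = sample{1};        % H-by-W-by-3 image
bboxes = sample{2};   % M-by-4 matrix of [x y width height] boxes
labels = sample{3};   % M-by-1 categorical vector of class names
reset(ds)             % rewind the datastore before training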

Layer graph, specified as a LayerGraph object. The layer graph contains the architecture of the SSD multibox network. You can create this network by using the ssdLayers function or create a custom network. For more information, see Getting Started with SSD Multibox Detection.

Previously trained SSD object detector, specified as an ssdObjectDetector object. Use this syntax to continue training a detector with additional training data or to perform more training iterations to improve detector accuracy.

Training options, specified as a TrainingOptionsSGDM, TrainingOptionsRMSProp, or TrainingOptionsADAM object returned by the trainingOptions (Deep Learning Toolbox) function. To specify the solver name and other options for network training, use the trainingOptions (Deep Learning Toolbox) function.

Note

The trainSSDObjectDetector function does not support these training options:

  • Datastore inputs are not supported when you set the DispatchInBackground training option to true.

Saved detector checkpoint, specified as an ssdObjectDetector object. To save the detector after every epoch, set the 'CheckpointPath' name-value argument when using the trainingOptions function. Saving a checkpoint after every epoch is recommended because network training can take a few hours.

To load a checkpoint for a previously trained detector, load the MAT-file from the checkpoint path. For example, if the CheckpointPath property of the object specified by options is '/checkpath', you can load a checkpoint MAT-file by using this code.

data = load('/checkpath/ssd_checkpoint__216__2018_11_16__13_34_30.mat');
checkpoint = data.detector;

The name of the MAT-file includes the iteration number and timestamp of when the detector checkpoint was saved. The detector is saved in the detector variable of the file. Pass this file back into the trainSSDObjectDetector function:

ssdDetector = trainSSDObjectDetector(trainingData,checkpoint,options);

Output Arguments


Trained SSD object detector, returned as an ssdObjectDetector object. You can train an SSD object detector to detect multiple object classes.

Training progress information, returned as a structure array with nine fields. Each field corresponds to a stage of training.

  • TrainingLoss — Training loss at each iteration is the mean squared error (MSE) calculated as the sum of localization error, confidence loss, and classification loss. For more information about the training loss function, see Training Loss.

  • TrainingAccuracy — Training set accuracy at each iteration.

  • TrainingRMSE — Training root mean squared error (RMSE) is the RMSE calculated from the training loss at each iteration.

  • BaseLearnRate — Learning rate at each iteration.

  • ValidationLoss — Validation loss at each iteration.

  • ValidationAccuracy — Validation accuracy at each iteration.

  • ValidationRMSE — Validation RMSE at each iteration.

  • FinalValidationLoss — Final validation loss at end of the training.

  • FinalValidationRMSE — Final validation RMSE at end of the training.

Each field is a numeric vector with one element per training iteration. Values that have not been calculated at a specific iteration are assigned as NaN. The struct contains ValidationLoss, ValidationAccuracy, ValidationRMSE, FinalValidationLoss, and FinalValidationRMSE fields only when options specifies validation data.
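As a sketch, assuming info was returned from a training run whose options specified validation data, you can compare the training and validation loss; NaN entries appear as gaps in the plot:

figure
plot(info.TrainingLoss)
hold on
plot(info.ValidationLoss)   % NaN entries appear as gaps
hold off
grid on
legend('Training Loss','Validation Loss')
xlabel('Iteration')
ylabel('Loss')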

References

[1] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. "SSD: Single Shot MultiBox Detector." European Conference on Computer Vision (ECCV), Springer Verlag, 2016.

Introduced in R2020a