yolov2ReorgLayer

Create reorganization layer for YOLO v2 object detection network

Description

The yolov2ReorgLayer function creates a YOLOv2ReorgLayer object, which represents the reorganization layer for a you only look once version 2 (YOLO v2) object detection network. The reorganization layer reorganizes the high-resolution feature maps from a lower layer by stacking adjacent features into different channels. The output of the reorganization layer is fed to the depth concatenation layer, which concatenates the reorganized high-resolution features with the low-resolution features from a higher layer.

Creation

Syntax

layer = yolov2ReorgLayer(stride)
layer = yolov2ReorgLayer(stride,Name,Value)

Description

layer = yolov2ReorgLayer(stride) creates the reorganization layer for a YOLO v2 object detection network. The layer reorganizes the dimensions of the input feature maps according to the step size specified in stride. For details on creating a YOLO v2 network with a reorganization layer, see Design a YOLO v2 Detection Network with a Reorg Layer.

layer = yolov2ReorgLayer(stride,Name,Value) sets the Name property using a name-value pair. Enclose the property name in single quotes. For example, yolov2ReorgLayer(stride,'Name','yolo_Reorg') creates a reorganization layer with the name 'yolo_Reorg'.

Input Arguments

stride is the step size for traversing the input vertically and horizontally, specified as a two-element vector of positive integers in the form [a b], where a is the vertical step size and b is the horizontal step size.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Properties

Layer name, specified as a character vector or a string scalar. To include a layer in a layer graph, you must specify a nonempty unique layer name. If you train a series network with the layer and Name is set to '', then the software automatically assigns a name to the layer at training time.

Data Types: char | string

NumInputs is the number of inputs of the layer. This layer accepts a single input only.

Data Types: double

InputNames is the list of input names of the layer. This layer accepts a single input only.

Data Types: cell

NumOutputs is the number of outputs of the layer. This layer has a single output only.

Data Types: double

OutputNames is the list of output names of the layer. This layer has a single output only.

Data Types: cell

Examples

Specify the step size for reorganizing the dimensions of the input feature map.

stride = [2 2];

Create a YOLO v2 reorganization layer with the specified step size and the name 'yolo_Reorg'.

layer = yolov2ReorgLayer(stride,'Name','yolo_Reorg');

Inspect the properties of the YOLO v2 reorganization layer.

layer
layer = 
  YOLOv2ReorgLayer with properties:

      Name: 'yolo_Reorg'

   Hyperparameters
    Stride: [2 2]
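
As a follow-up, the sketch below shows one way the layer might be wired into a small layer graph so that its output feeds a depth concatenation layer. The network itself is hypothetical: the input size, the filter counts, and every layer name other than 'yolo_Reorg' are assumptions chosen for illustration, not part of a real YOLO v2 design.

```matlab
% Hypothetical toy backbone: a high-resolution branch (relu_low) and a
% lower-resolution branch (relu_high) that is half the height and width.
layers = [
    imageInputLayer([32 32 3],'Name','input','Normalization','none')
    convolution2dLayer(3,16,'Padding','same','Name','conv_low')
    reluLayer('Name','relu_low')
    maxPooling2dLayer(2,'Stride',2,'Name','pool')
    convolution2dLayer(3,32,'Padding','same','Name','conv_high')
    reluLayer('Name','relu_high')
    depthConcatenationLayer(2,'Name','depth_concat')];

% Layers in the array are connected in sequence, so relu_high feeds
% the first input of depth_concat.
lgraph = layerGraph(layers);

% Add the reorganization layer as a side branch from relu_low.
reorg = yolov2ReorgLayer([2 2],'Name','yolo_Reorg');
lgraph = addLayers(lgraph,reorg);
lgraph = connectLayers(lgraph,'relu_low','yolo_Reorg');

% Feed the reorganized features into the second concatenation input.
lgraph = connectLayers(lgraph,'yolo_Reorg','depth_concat/in2');
```

With these assumed sizes, relu_low produces a 32-by-32-by-16 feature map, which a stride of [2 2] reorganizes to 16-by-16-by-64. That matches the 16-by-16 height and width of relu_high, so the two can be concatenated along the channel dimension.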

Tips

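  • You can find the desired value of stride by dividing the height and width of the input feature map to the reorganization layer by the height and width of the higher layer feature map, and rounding down.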

Algorithms

The reorganization layer improves the performance of the YOLO v2 object detection network by facilitating feature concatenation from different layers. It reorganizes the dimensions of a lower layer feature map so that it can be concatenated with the higher layer feature map.

Consider an input feature map of size [H W C], where:

  • H is the height of the feature map.

  • W is the width of the feature map.

  • C is the number of channels.

The reorganization layer chooses feature map values from locations based on the step sizes in stride and adds those feature values to the third dimension C. The size of the reorganized feature map from the reorganization layer is [floor(H/stride(1)) floor(W/stride(2)) C×stride(1)×stride(2)].
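
The size formula can be illustrated with a small sketch in plain MATLAB. The input values and the variable names X, Y, and k are hypothetical, and the order in which sampled values are stacked along the third dimension is an assumption made for illustration; the ordering used internally by the layer may differ.

```matlab
% Hypothetical input feature map of size [H W C] filled with unique values
H = 4; W = 6; C = 3;
stride = [2 2];
X = reshape(1:H*W*C,[H W C]);

% Output size according to the formula above
outH = floor(H/stride(1));
outW = floor(W/stride(2));
outC = C*stride(1)*stride(2);

% Sample every stride-th location for each offset (a,b) and stack the
% samples along the third dimension (assumed ordering)
Y = zeros(outH,outW,outC);
k = 0;
for a = 1:stride(1)
    for b = 1:stride(2)
        block = X(a:stride(1):outH*stride(1), b:stride(2):outW*stride(2), :);
        Y(:,:,k*C+(1:C)) = block;
        k = k + 1;
    end
end

size(Y)   % [2 3 12]
```

When H and W are divisible by the corresponding strides, as they are here, every input value appears exactly once in the output; only its position changes.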

For feature concatenation, the height and width of the reorganized feature map must match the height and width of the higher layer feature map.
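
One way to check this requirement before building a network is to work through the sizes by hand. The sketch below assumes hypothetical feature map sizes (lowerSize and higherSize are illustrative names and values, not outputs of any particular network) and derives a stride from them using the size formula above.

```matlab
% Hypothetical feature map sizes, given as [height width channels]
lowerSize  = [26 26 512];    % input to the reorganization layer
higherSize = [13 13 1024];   % higher layer feature map to concatenate with

% Stride that brings the lower layer map down to the higher layer size
stride = floor(lowerSize(1:2) ./ higherSize(1:2));

% Size of the reorganized feature map
reorgSize = [floor(lowerSize(1)/stride(1)), ...
             floor(lowerSize(2)/stride(2)), ...
             lowerSize(3)*stride(1)*stride(2)];

% Concatenation along channels requires equal height and width
assert(isequal(reorgSize(1:2),higherSize(1:2)), ...
    'Height and width must match for depth concatenation.');

% Size after depth concatenation
concatSize = [higherSize(1:2), reorgSize(3) + higherSize(3)];
```

With these values, stride is [2 2], the reorganized map is 13-by-13-by-2048, and the concatenated output is 13-by-13-by-3072.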

References

[1] Redmon, J., S. K. Divvala, R. B. Girshick, and A. Farhadi. "You Only Look Once: Unified, Real-Time Object Detection." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788. Las Vegas, NV: CVPR, 2016.

[2] Redmon, J., and A. Farhadi. "YOLO9000: Better, Faster, Stronger." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6517–6525. Honolulu, HI: CVPR, 2017.

Introduced in R2019a