
evaluateDetectionAOS

Evaluate average orientation similarity metric for object detection

Description


metrics = evaluateDetectionAOS(detectionResults,groundTruthData) computes the average orientation similarity (AOS) metric, which measures the detection results detectionResults against the ground truth data groundTruthData. The AOS metric measures detector performance on rotated rectangle detections.

metrics = evaluateDetectionAOS(detectionResults,groundTruthData,threshold) additionally specifies the overlap threshold for assigning a detection to a ground truth bounding box.
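For example, to require a stricter overlap before a detection is counted as a match (a sketch; the 0.6 value is illustrative, and detectionResults and groundTruthData must already exist in the workspace, as in the example below):

```matlab
% Require at least 60% intersection over union between a detection and a
% ground truth box for the detection to count as a true positive.
metrics = evaluateDetectionAOS(detectionResults,groundTruthData,0.6);
```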

Examples


Define ground truth bounding boxes for a vehicle class. Each row defines a rotated bounding box of the form [xcenter, ycenter, width, height, yaw].

gtbbox = [
    2 2 10 20 45
    80 80 30 40 15
    ];

gtlabel = "vehicle";

Create a table to hold the ground truth data.

groundTruthData = table({gtbbox},'VariableNames',gtlabel)
groundTruthData=table
      vehicle   
    ____________

    {2x5 double}

Define detection results for rotated bounding boxes, scores, and labels.

bbox = [
    4 4 10 20 20
    50 50 30 10 30
    90 90 40 50 10 ];

scores = [0.9 0.7 0.8]';

labels = [
    "vehicle"
    "vehicle"
    "vehicle"
    ];
labels = categorical(labels,"vehicle");

Create a table to hold the detection results.

detectionResults = table({bbox},{scores},{labels},'VariableNames',{'Boxes','Scores','Labels'})
detectionResults=1×3 table
       Boxes           Scores            Labels      
    ____________    ____________    _________________

    {3x5 double}    {3x1 double}    {3x1 categorical}

Evaluate the detection results against ground truth by calculating the AOS metric.

metrics = evaluateDetectionAOS(detectionResults,groundTruthData)
metrics=1×5 table
                AOS        AP       OrientationSimilarity     Precision         Recall   
               ______    _______    _____________________    ____________    ____________

    vehicle    0.5199    0.54545        {4x1 double}         {4x1 double}    {4x1 double}
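The per-class curves returned in the table can be inspected directly. As a sketch, assuming the metrics table computed above, which has a single row for the vehicle class:

```matlab
% Extract the precision and recall curves for the first (and only) class
% row, then plot the precision-recall curve.
precision = metrics.Precision{1};   % numeric column vector
recall    = metrics.Recall{1};      % numeric column vector
plot(recall,precision)
xlabel("Recall")
ylabel("Precision")
title("Precision-Recall for the vehicle class")
```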

Input Arguments


Detection results, specified as a three-column table. The columns contain bounding boxes, scores, and labels. The bounding boxes can be axis-aligned rectangles or rotated rectangles.

  • Axis-aligned rectangle: [xmin, ymin, width, height]. This type of bounding box is defined in pixel coordinates as an M-by-4 matrix representing M bounding boxes.

  • Rotated rectangle: [xcenter, ycenter, width, height, yaw]. This type of bounding box is defined in spatial coordinates as an M-by-5 matrix representing M bounding boxes. The xcenter and ycenter coordinates represent the center of the bounding box. The width and height elements represent the length of the box along the x- and y-axes, respectively. The yaw element represents the rotation angle, in degrees, measured in the clockwise direction about the center of the bounding box.

Labeled ground truth data, specified as a datastore or a table.

  • If you use a datastore, your data must be set up so that calling the datastore with the read and readall functions returns a cell array or table with two or three columns. When the output contains two columns, the first column must contain bounding boxes, and the second column must contain labels, {boxes,labels}. When the output contains three columns, the second column must contain the bounding boxes, and the third column must contain the labels. In this case, the first column can contain any type of data. For example, the first column can contain images or point cloud data.

    The first column can contain data, such as point cloud data or images. The second column must be a cell array that contains M-by-5 matrices of bounding boxes of the form [xcenter, ycenter, width, height, yaw]. The vectors represent the location and size of the bounding boxes for the objects in each image. The third column must be a cell array that contains M-by-1 categorical vectors of object class names. All categorical data returned by the datastore must contain the same categories.

    For more information, see Datastores for Deep Learning (Deep Learning Toolbox).

  • If you use a table, the table must have two or more columns.

    The first column can contain data, such as point cloud data or images. Each of the remaining columns must be a cell vector that contains M-by-5 matrices representing rotated rectangle bounding boxes of the form [xcenter, ycenter, width, height, yaw]. The vectors represent the location and size of the bounding boxes for the objects in each image.

Overlap threshold, specified as a nonnegative scalar. The overlap ratio is defined as the intersection over union (IoU) of a detected bounding box and a ground truth bounding box.
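That is, for a detected box A and a ground truth box B:

```latex
\mathrm{IoU}(A,B) = \frac{\operatorname{area}(A \cap B)}{\operatorname{area}(A \cup B)}
```

A detection is assigned to a ground truth box only when this ratio meets or exceeds the overlap threshold.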

Output Arguments


AOS metrics, returned as a five-column table. Each row in the table contains the evaluation metrics for a class defined in the ground truth data contained in the groundTruthData input. To get the object class names, use:

metrics.Properties.RowNames
This table describes the five columns in the metrics table.

AOS

Average orientation similarity value, returned as a numeric scalar.

AP

Average precision over all the detection results, returned as a numeric scalar. Precision is a ratio of true positive instances to all positive instances of objects in the detector, based on the ground truth.

OrientationSimilarity

Orientation similarity values for each detection, returned as an M-element numeric column vector. M is one more than the number of detections assigned to a class. The first value of OrientationSimilarity is 1.

Orientation similarity is a normalized variant of the cosine similarity that measures the similarity between the predicted rotation angle and the ground truth rotation angle.
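Following the KITTI definition [1], the orientation similarity at recall level r can be sketched as follows, where D(r) is the set of detections at recall r, Δθ(i) is the difference between the predicted and ground truth yaw angles of detection i, and δi is 1 if detection i is assigned to a ground truth box and 0 otherwise (the function's exact interpolation scheme over recall levels may differ):

```latex
s(r) = \frac{1}{|D(r)|} \sum_{i \in D(r)} \frac{1 + \cos \Delta_{\theta}^{(i)}}{2}\,\delta_{i}
```

The term (1 + cos Δθ)/2 equals 1 when the predicted and ground truth angles agree exactly and decreases toward 0 as they diverge. AOS averages this similarity over recall levels.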

Precision

Precision values from each detection, returned as an M-element numeric column vector. M is one more than the number of detections assigned to a class. For example, if your detection results contain 4 detections with class label 'car', then Precision contains 5 elements. The first value of Precision is 1.

Precision is a ratio of true positive instances to all positive instances of objects in the detector, based on the ground truth.

Recall

Recall values from each detection, returned as an M-element numeric column vector. M is one more than the number of detections assigned to a class. For example, if your detection results contain 4 detections with class label 'car', then Recall contains 5 elements. The first value of Recall is 0.

Recall is a ratio of true positive instances to the sum of true positives and false negatives in the detector, based on the ground truth.
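In terms of true positives (TP), false positives (FP), and false negatives (FN), the two ratios above are:

```latex
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}
```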

References

[1] Geiger, A., P. Lenz, and R. Urtasun. "Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite." IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012.

Introduced in R2020a