Accelerate Vehicle Detection with SIMD

This example shows how to perform automatic detection and tracking of vehicles in a video. You can generate SIMD code using Intel™ AVX2 technology to increase the number of video frames that the generated executable processes per second. A higher frame rate improves the quality and speed of the detection and tracking system.

Detect Vehicles Using ACF Vehicle Detector

Create an acfObjectDetector (Computer Vision Toolbox) object for detecting vehicles.

detector = vehicleDetectorACF('full-view');

To support code generation, the vehicle detector object must be in the form of a structure. Use the toStruct (Computer Vision Toolbox) function to create a structure that stores the properties of the input vehicle detector object in the Classifier and TrainingOptions fields.

sModel = toStruct(detector);

Save the structure and a detection threshold value specified as detectionThresh to a .mat file.

detectionThresh = 17;
save('model.mat','sModel', 'detectionThresh');

Examine vehicleDetection Entry-Point Function

The vehicleDetection.m file is the main entry-point function for code generation. The vehicleDetection function loads the model.mat file that you just created and recreates an acfObjectDetector object to detect vehicles within the input video. The input video file is from the Caltech lanes data set and ships with Automated Driving Toolbox™.

The vehicleDetection function uses the vision.VideoFileReader (Computer Vision Toolbox) system object to read frames from the input video and the vision.DeployableVideoPlayer (Computer Vision Toolbox) system object to display the vehicle detection video output. The function draws boxes around detected vehicles and displays the frame rate in the corner of the output video and in the MATLAB™ command window.

type vehicleDetection.m
function vehicleDetection()

model = coder.load('model.mat');
thresh = model.detectionThresh;
detector = acfObjectDetector(model.sModel.Classifier,model.sModel.TrainingOptions);

% Set up system objects to read a video file
videoFReader   = vision.VideoFileReader('caltech_cordova1.avi');

% Use deployable video player to show result
depVideoPlayer = vision.DeployableVideoPlayer;
 
totalTime=0; nFrames=0;
maxNumBoxes=30;

% Continue to read frames of video until the last frame is read 
while ~isDone(videoFReader)
    boundedBoxes=zeros(maxNumBoxes,4,'int32');
    videoFrame = videoFReader();

    tic;
    [boxes_raw, scores] = detect(detector,videoFrame);
    time = toc;
    
    % Filter out low-confidence detections
    boxes = boxes_raw(scores>thresh, :);

    % Draw boxes around detected vehicles in frame
    nBoxes=size(boxes,1);
    if(nBoxes>0 && nBoxes<=maxNumBoxes)
        boundedBoxes(1:nBoxes,1:4)= int32(boxes(1:nBoxes,1:4)); 
        videoFrame = insertShape(videoFrame,'Rectangle',int32(boundedBoxes));
    end    
    totalTime=totalTime+time;
    nFrames=nFrames+1;

    % Print frames per second to the frame corner
    frameRate = nFrames/totalTime;
    videoFrame = insertText(videoFrame, [20 20], sprintf('%0.2f FPS', frameRate), 'AnchorPoint', 'LeftBottom');

 
    depVideoPlayer(videoFrame);

    
end
  
% Release system objects
release(videoFReader);
release(depVideoPlayer);

% Print frames per second to command window
avgTime = totalTime/nFrames;
fprintf('Average time = %g \n', avgTime);
fprintf('Average frame rate = %g \n', 1/avgTime);
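
The score filtering and fixed-size box buffer inside the loop can be tried in isolation. The following is a minimal sketch, not part of the shipped example: the boxes and scores are hypothetical values standing in for the output of detect, and only the threshold and buffer size are taken from this example.

```matlab
% Hypothetical detector output: each row is [x y width height]
boxes_raw = [ 10  20 50 40;
             200 120 60 45;
             330  90 80 60];
scores = [25.3; 9.8; 17.5];   % hypothetical confidence scores

thresh = 17;                  % detection threshold used in this example
maxNumBoxes = 30;             % buffer size used in vehicleDetection

% Keep only the rows whose score exceeds the threshold
boxes = boxes_raw(scores > thresh, :);
nBoxes = size(boxes, 1);

% Copy the surviving boxes into a fixed-size int32 buffer
boundedBoxes = zeros(maxNumBoxes, 4, 'int32');
if nBoxes > 0 && nBoxes <= maxNumBoxes
    boundedBoxes(1:nBoxes, :) = int32(boxes);
end

disp(boundedBoxes(1:nBoxes, :))
```

With these values, the second detection is discarded because its score is below the threshold, and the unused rows of the buffer stay zero.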
    

Configure Code Generation Configuration Object

To generate a standalone executable for the vehicleDetection entry-point function, use the coder.config function to create a coder.EmbeddedCodeConfig object for an exe target. This object contains the configuration parameters that the codegen function uses for generating an executable program with Embedded Coder™.

ecfg = coder.config('exe');

Specify an example main C function that the code generator compiles to create a test executable.

ecfg.GenerateExampleMain = 'GenerateCodeAndCompile';

Optimize the build for faster running executables.

ecfg.BuildConfiguration = 'Faster Runs';

Specify that the code generator does not produce code to handle integer overflow and produces code to support nonfinite values (Inf and NaN) only if they are used.

ecfg.SaturateOnIntegerOverflow = false;
ecfg.SupportNonFinite = true;

Allocate memory dynamically on the heap for variable-size arrays whose size (in bytes) is greater than or equal to the value of the DynamicMemoryAllocationThreshold parameter.

ecfg.DynamicMemoryAllocation = 'Threshold'; 
ecfg.DynamicMemoryAllocationThreshold = 2e8;
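
To get a feel for the threshold, you can compare it with the size of one video frame. This is a rough sketch under stated assumptions: the frame dimensions and the single-precision data type are assumed for illustration and are not confirmed by this example, and the threshold applies only to variable-size arrays.

```matlab
thresholdBytes = 2e8;         % DynamicMemoryAllocationThreshold set above

% Assumed frame properties (hypothetical; check your own video)
frameSize = [480 640 3];      % rows, columns, color channels
bytesPerElement = 4;          % single precision

frameBytes = prod(frameSize) * bytesPerElement;

% A variable-size array goes on the heap only if it reaches the threshold
usesHeap = frameBytes >= thresholdBytes;

fprintf('Frame size: %d bytes, threshold: %d bytes, heap: %d\n', ...
    frameBytes, thresholdBytes, usesHeap);
```

Under these assumptions, an array the size of one frame is well below the threshold, so a variable-size array of that size would not be allocated on the heap.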

Because this example generates code from Automated Driving Toolbox™ and Computer Vision Toolbox™ functions, the generated code must be portable and not rely on third-party libraries. To generate portable code that you can retarget for an Intel device, use the coder.HardwareImplementation object stored in the HardwareImplementation property of the configuration object to specify a nonhost target. Then, configure the production hardware settings to match those of an Intel device.

ecfg.HardwareImplementation.ProdHWDeviceType = "Generic->Custom";
ecfg.HardwareImplementation.ProdBitPerLong = 64;
ecfg.HardwareImplementation.ProdBitPerPointer = 64;
ecfg.HardwareImplementation.ProdBitPerPtrDiffT = 64;
ecfg.HardwareImplementation.ProdBitPerSizeT = 64;
ecfg.HardwareImplementation.ProdEndianess = "LittleEndian";
ecfg.HardwareImplementation.ProdIntDivRoundTo = "Zero";
ecfg.HardwareImplementation.ProdLargestAtomicFloat = "Float";
ecfg.HardwareImplementation.ProdWordSize = 64;

For some Image Processing Toolbox™ functions, the code generator uses the OpenMP application programming interface to support shared-memory, multicore code generation. To achieve the highest frame rate and avoid inefficiencies due to the processor trying to use too many threads, consider specifying a maximum number of threads to run parallel for-loops in the generated code. To do so, set the OpenMP environment variable, OMP_NUM_THREADS, to a number less than or equal to the number of cores in your processor. For more information, see OpenMP Specifications. This example sets this variable to 4.

setenv('OMP_NUM_THREADS','4')
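
If you do not know the core count of the machine in advance, you can cap the requested value before setting the variable. The following is a sketch, not part of the shipped example. It assumes that the value returned by maxNumCompThreads is a reasonable proxy for the number of cores, which holds by default but not if the limit was changed in the current session.

```matlab
requestedThreads = 4;                   % value used in this example

% MATLAB computational thread limit, used here as a proxy for core count
availableThreads = maxNumCompThreads;

% Never request more threads than the machine appears to provide
numThreads = min(requestedThreads, availableThreads);

setenv('OMP_NUM_THREADS', num2str(numThreads));
disp(getenv('OMP_NUM_THREADS'))
```

Because the executable is launched from MATLAB, it inherits the environment variable set in this session.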

Generate Non-SIMD Code

evalc('codegen -config ecfg vehicleDetection.m');

Run the executable and observe the frame rate at the top left of the video and in the command window. This example runs the executable on Windows. To run the executable on Linux, change the command to !./vehicleDetection.

!vehicleDetection.exe
Average time = 0.0763511  
Average frame rate = 13.0974  
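
If the same script has to run on both Windows and Linux, you can choose the command at run time instead of editing it by hand. This is a minimal sketch that assumes the executable is named after the entry-point function and is located in the current folder.

```matlab
exeName = 'vehicleDetection';   % assumed to match the entry-point name

% Build the platform-specific command
if ispc
    cmd = [exeName '.exe'];
else
    cmd = ['./' exeName];
end

% Run the executable only if it has been generated
if isfile(cmd)
    status = system(cmd);
else
    fprintf('Executable %s not found. Generate code first.\n', cmd);
end
```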

Generate SIMD Code

Configure the code generation configuration object to generate SIMD code using AVX2 technology.

ecfg.InstructionSetExtensions = "AVX2";

Generate code.

evalc('codegen -config ecfg vehicleDetection.m');

Run the executable and observe the higher frame rate at the top left of the video and in the command window.

!vehicleDetection.exe
Average time = 0.0632882  
Average frame rate = 15.8007  
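
To quantify the improvement, you can compare the two average times reported above. This short sketch uses only the numbers printed by the two runs in this example. Results on other hardware will differ.

```matlab
avgTimeBase = 0.0763511;   % average time per frame without SIMD (from above)
avgTimeSimd = 0.0632882;   % average time per frame with AVX2 (from above)

% Ratio of frame rates, which equals the inverse ratio of frame times
speedup = avgTimeBase / avgTimeSimd;

% Relative reduction in per-frame processing time, in percent
timeReduction = 100 * (1 - avgTimeSimd / avgTimeBase);

fprintf('Speedup = %.2fx, per-frame time reduced by %.1f%%\n', ...
    speedup, timeReduction);
```

For these two runs, the SIMD build is about 1.2 times faster than the non-SIMD build.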
