Deep Learning Processing of Live Video

The SoC Blockset™ Support Package for Xilinx® Devices includes the RGB with DL Processor reference design. Use this reference design for deep learning (DL) applications that process live video.

This reference design feeds live HDMI video through custom preprocessing logic, a DL processor, and custom postprocessing code, and then returns the modified video to the HDMI output of the board. The preprocessing logic and the DL processor are on the FPGA. These two parts of the design communicate control information over an AXI manager interface and share video data through a second AXI manager interface to DDR memory. The postprocessing logic runs on the ARM® processor and reads video data from the same memory.

This diagram shows the interfaces in the RGB with DL Processor reference design.

The FPGA user logic for this reference design must contain two simplified AXI manager protocol interfaces. One interface interacts with the DL IP core and the other transfers data between the FPGA user logic and DDR memory. The AXI manager interfaces are the same as those in the Deep Learning with Preprocessing Interface design.

  • AXI-Lite — The ARM and FPGA parts of the design communicate with each other by using AXI-Lite registers.

  • AXI4 Manager of DDR — The FPGA user logic writes output data to the PL DDR memory through this interface. The deep learning IP then reads the data for processing.

  • AXI4 Manager of DL IP — The FPGA user logic and the deep learning IP communicate control information over this interface. The FPGA user logic must contain logic for the handshaking protocol of the deep learning IP, as sketched after this list. The YOLO v2 Vehicle Detector with Live Camera Input on Zynq-Based Hardware (Vision HDL Toolbox) example includes a subsystem that shows how to model this handshake protocol.
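
For illustration, this is a minimal sketch of the kind of handshake state machine the FPGA user logic can implement, written as a MATLAB function suitable for a MATLAB Function block. The register offsets, the frame buffer address, and the signal names are illustrative assumptions, not the register map of a generated deep learning processor IP core; see the YOLO v2 example for the exact protocol.

    function [regAddr,regData,regWrite,frameDone] = dlHandshakeSketch(frameReady,dlDone)
    % Sketch of the user-logic side of the DL IP handshake.
    % All register offsets and addresses below are hypothetical.
    INPUT_ADDR_REG = uint32(16);         % hypothetical: input frame address register
    START_REG      = uint32(32);         % hypothetical: start-processing register
    FRAME_BASE     = uint32(2147483648); % hypothetical DDR frame buffer base (0x80000000)

    persistent state
    if isempty(state)
        state = 0;  % 0 = idle, 1 = start DL IP, 2 = wait for done
    end

    regAddr = uint32(0); regData = uint32(0);
    regWrite = false; frameDone = false;

    switch state
        case 0  % wait until preprocessing has written a frame to DDR
            if frameReady
                regAddr = INPUT_ADDR_REG; regData = FRAME_BASE; regWrite = true;
                state = 1;
            end
        case 1  % tell the DL IP to start processing the frame
            regAddr = START_REG; regData = uint32(1); regWrite = true;
            state = 2;
        case 2  % wait for the DL IP to signal completion
            if dlDone
                frameDone = true;
                state = 0;
            end
    end
    end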

In this reference design, the FPGA converts the HDMI input to an RGB pixel stream with an associated pixelcontrol bus, and converts the output data written by the ARM processor back to HDMI format for output.
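
For reference, each pixel in a Vision HDL Toolbox streaming interface is accompanied by a pixelcontrol structure that marks line and frame boundaries. This short sketch builds one control sample by using the Vision HDL Toolbox pixelcontrolstruct function; the flag values shown mark the first valid pixel of a frame.

    % Control signals for the first valid pixel of a frame
    % (requires Vision HDL Toolbox).
    ctrl = pixelcontrolstruct(true,false,true,false,true);
    disp(ctrl)  % fields: hStart, hEnd, vStart, vEnd, valid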

You can use a Video Capture HDMI block to capture the output video into Simulink®. The captured video is the result of the postprocessing operation in the ARM processor.

To use this reference design, you must specify the name and file location of a deep learning processor core generated by using the Deep Learning HDL Toolbox™ tools. The YOLO v2 Vehicle Detector with Live Camera Input on Zynq-Based Hardware (Vision HDL Toolbox) example shows how to use this reference design, how to model the AXI interfaces and the handshaking logic between the preprocessing logic and the DL processor, and how to model the postprocessing operations.
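
For example, you can generate a processor core from the default configuration by using the dlhdl.ProcessorConfig object and the dlhdl.buildProcessor function. This minimal sketch assumes a local Xilinx Vivado installation; the tool path shown is an example, not a requirement.

    % Point the HDL tools at a Vivado installation (example path).
    hdlsetuptoolpath('ToolName','Xilinx Vivado', ...
        'ToolPath','C:\Xilinx\Vivado\2022.1\bin\vivado.bat');

    % Generate a deep learning processor IP core from the default
    % configuration. Specify the name and location of the generated
    % core when you configure this reference design.
    hPC = dlhdl.ProcessorConfig;
    dlhdl.buildProcessor(hPC);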

The postprocessing operations in the YOLO v2 Vehicle Detector with Live Camera Input on Zynq-Based Hardware (Vision HDL Toolbox) example use annotation blocks that are designed for deployment to the ARM processor. When deployed, these blocks read video frames from external memory, modify the pixel values, and write the modified video frames back to memory. For example, see the Draw Rectangle and Set ROI block reference pages.
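
Conceptually, each annotation operation is an in-place update of pixel values in a frame buffer. This MATLAB sketch shows the kind of operation the Draw Rectangle block performs; it illustrates the idea only, and drawRectSketch and its arguments are hypothetical names, not the deployed block implementation.

    function frame = drawRectSketch(frame,roi,color)
    % Draw a one-pixel rectangle outline on an RGB frame by modifying
    % pixel values in place. roi = [x y width height], color = [R G B].
    x1 = roi(1);  y1 = roi(2);
    x2 = x1 + roi(3) - 1;
    y2 = y1 + roi(4) - 1;
    for c = 1:3
        frame(y1,x1:x2,c) = color(c);  % top edge
        frame(y2,x1:x2,c) = color(c);  % bottom edge
        frame(y1:y2,x1,c) = color(c);  % left edge
        frame(y1:y2,x2,c) = color(c);  % right edge
    end
    end

For example, drawRectSketch(frame,[100 80 64 48],[255 255 0]) draws a yellow box with its top-left corner at pixel (100,80).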

For a reference design that connects a deep learning processor with custom preprocessing logic and can be controlled from a MATLAB® host machine, see Target Deep Learning Processor and Image Preprocessing to FPGA.
