Deep Network Quantizer
Quantize a deep neural network to 8-bit scaled integer data types
Description
Use the Deep Network Quantizer app to reduce the memory requirement of a deep neural network by quantizing weights, biases, and activations of convolution layers to 8-bit scaled integer data types. Using this app you can:
Visualize the dynamic ranges of convolution layers in a deep neural network.
Select individual network layers to quantize.
Assess the performance of a quantized network.
Generate GPU code to deploy the quantized network using GPU Coder™.
Generate HDL code to deploy the quantized network to an FPGA using Deep Learning HDL Toolbox™.
Generate C++ code to deploy the quantized network to an ARM Cortex-A processor using MATLAB® Coder™.
Generate a simulatable quantized network that you can explore in MATLAB without generating code or deploying to hardware.
This app requires Deep Learning Toolbox Model Quantization Library. To learn about the products required to quantize a deep neural network, see Quantization Workflow Prerequisites.
Open the Deep Network Quantizer App
MATLAB command prompt: Enter deepNetworkQuantizer.
MATLAB toolstrip: On the Apps tab, under Machine Learning and Deep Learning, click the app icon.
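You can also open the app preloaded with a network from the command line. A minimal sketch, assuming a pretrained network such as squeezenet is available on your path:

```matlab
% Open the Deep Network Quantizer app with a pretrained network loaded.
% squeezenet is used here only as an illustrative choice of network.
net = squeezenet;
deepNetworkQuantizer(net)
```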
Parameters
Execution Environment
— Execution environment
GPU (default) | FPGA | CPU | MATLAB
When you select New > Quantize a Network, the app allows you to choose the execution environment for the quantized network. How the network is quantized depends on the choice of execution environment.
When you select the MATLAB execution environment, the app performs target-agnostic quantization of the neural network. This option does not require you to have target hardware in order to explore the quantized network in MATLAB.
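The app's MATLAB execution environment corresponds to target-agnostic quantization at the command line. A minimal sketch, assuming a pretrained network and a folder of calibration images (the network choice and folder name here are placeholders):

```matlab
% Target-agnostic (MATLAB) quantization at the command line.
net = squeezenet;                                 % illustrative network
quantObj = dlquantizer(net,'ExecutionEnvironment','MATLAB');

% Calibration data: a placeholder folder of labeled images.
calData = imageDatastore('calibrationImages', ...
    'IncludeSubfolders',true,'LabelSource','foldernames');

calibrate(quantObj,calData);   % collect dynamic-range statistics
qNet = quantize(quantObj);     % simulatable quantized network
```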
Hardware Settings
— Hardware settings
simulation environment | target
Specify hardware settings based on your execution environment.
GPU Execution Environment
Select from the following simulation environments:

Simulation Environment | Action |
---|---|
GPU (Simulate on host GPU) | Deploys the quantized network to the host GPU. Validates the quantized network by comparing its performance to the single-precision version of the network. |
MATLAB (Simulate in MATLAB) | Simulates the quantized network in MATLAB. Validates the quantized network by comparing its performance to the single-precision version of the network. |
FPGA Execution Environment
Select from the following simulation environments:

Simulation Environment | Action |
---|---|
MATLAB (Simulate in MATLAB) | Simulates the quantized network in MATLAB. Validates the quantized network by comparing its performance to the single-precision version of the network. |
Intel Arria 10 SoC (arria10soc_int8) | Deploys the quantized network to an Intel® Arria® 10 SoC board by using the arria10soc_int8 bitstream. Validates the quantized network by comparing its performance to the single-precision version of the network. |
Xilinx ZCU102 (zcu102_int8) | Deploys the quantized network to a Xilinx® Zynq® UltraScale+™ MPSoC ZCU102 board by using the zcu102_int8 bitstream. Validates the quantized network by comparing its performance to the single-precision version of the network. |
Xilinx ZC706 (zc706_int8) | Deploys the quantized network to a Xilinx Zynq-7000 ZC706 board by using the zc706_int8 bitstream. Validates the quantized network by comparing its performance to the single-precision version of the network. |

When you select the Intel Arria 10 SoC, Xilinx ZCU102, or Xilinx ZC706 option, additionally select the interface to use to deploy and validate the quantized network.
Target Option | Action |
---|---|
JTAG | Programs the target FPGA board selected under Simulation Environment by using a JTAG cable. For more information, see JTAG Connection (Deep Learning HDL Toolbox). |
Ethernet | Programs the target FPGA board selected under Simulation Environment through the Ethernet interface. Specify the IP address for your target board in the IP Address field. |

CPU Execution Environment
The Hardware Settings button is disabled. However, you must use the raspi function to establish a connection to your Raspberry Pi™ board prior to the Quantize and Validate step.
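For example, a connection can be established like this (the address and credentials below are placeholders for your board's values):

```matlab
% Connect to the Raspberry Pi board before the Quantize and Validate step.
% Replace the address, user name, and password with your board's values.
r = raspi('raspiname','pi','password');
```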
Quantization Options
— Options for quantization and validation
metric function | exponent scheme
By default, the Deep Network Quantizer app determines a metric function to use for the validation based on the type of network that is being quantized.
Type of Network | Metric Function |
---|---|
Classification | Top-1 Accuracy – Accuracy of the network |
Object Detection | Average Precision – Average precision over all detection results |
Regression | MSE – Mean squared error of the network |
Semantic Segmentation | WeightedIOU – Average IoU of each class, weighted by the number of pixels in that class |
You can also specify a custom metric function to use for validation.
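At the command line, a custom metric function is passed through a dlquantizationOptions object. A sketch, where hComputeAccuracy is a hypothetical user-defined helper, and net, valData, and quantObj are assumed to come from an earlier calibration session:

```matlab
% Validate with a custom metric function. hComputeAccuracy is a
% hypothetical helper that scores the quantized network on valData.
quantOpts = dlquantizationOptions( ...
    'MetricFcn',{@(x) hComputeAccuracy(x,net,valData)});
valResults = validate(quantObj,valData,quantOpts);
```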
You can also choose the exponent selection scheme used to quantize the network:
MinMax — (default) Evaluate the exponent based on the range information in the calibration statistics and avoid overflows.
Histogram — Distribution-based scaling that evaluates the exponent to best fit the calibration data.
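At the command line, the equivalent choice can be made when quantizing; a sketch, assuming your release supports the ExponentScheme name-value argument of quantize:

```matlab
% Quantize using distribution-based (Histogram) exponent selection,
% with quantObj being a calibrated dlquantizer object.
qNet = quantize(quantObj,'ExponentScheme','Histogram');
```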
Export
— Options for exporting quantized network
Export Quantized Network
| Export Quantizer
| Generate Code
Export Quantized Network — After calibrating the network, quantize it and add the quantized network to the base workspace. This option exports a simulatable quantized network, quantizedNet, that you can explore in MATLAB without deploying to hardware. This option is equivalent to using quantize at the command line. Code generation is not supported for the exported quantized network, quantizedNet.
Export Quantizer — Add the dlquantizer object to the base workspace. You can save the dlquantizer object and use it for further exploration in the Deep Network Quantizer app or at the command line, or use it to generate code for your target hardware.
Generate Code — Open the GPU Coder app and generate GPU code from the quantized and validated neural network. Generating GPU code requires a GPU Coder license.