validate
Quantize and validate a deep neural network
Syntax
Description
quantizes the weights, biases, and activations in the convolution layers of the network, and
validates the network specified by validationResults
= validate(quantObj
, valData
)dlquantizer
object,
quantObj
and using the data specified by
valData
.
quantizes the weights, biases, and activations in the convolution layers of the network, and
validates the network specified by validationResults
= validate(quantObj
, valData
, quantOpts
)dlquantizer
object,
quantObj
, using the data specified by valData
,
and the optional argument quantOpts
that specifies a metric function to
evaluate the performance of the quantized network.
To learn about the products required to quantize a deep neural network, see Quantization Workflow Prerequisites.
Examples
Quantize a Neural Network for GPU Target
This example shows how to quantize learnable parameters in the convolution layers of a neural network for GPU and explore the behavior of the quantized network. In this example, you quantize the squeezenet neural network after retraining the network to classify new images according to the Train Deep Learning Network to Classify New Images example. In this example, the memory required for the network is reduced approximately 75% through quantization while the accuracy of the network is not affected.
Load the pretrained network. net
is the output network of the Train Deep Learning Network to Classify New Images example.
load squeezenetmerch
net
net = DAGNetwork with properties: Layers: [68×1 nnet.cnn.layer.Layer] Connections: [75×2 table] InputNames: {'data'} OutputNames: {'new_classoutput'}
Define calibration and validation data to use for quantization.
The calibration data is used to collect the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network and the dynamic ranges of the activations in all layers of the network. For the best quantization results, the calibration data must be representative of inputs to the network.
The validation data is used to test the network after quantization to understand the effects of the limited range and precision of the quantized convolution layers in the network.
In this example, use the images in the MerchData
data set. Define an augmentedImageDatastore
object to resize the data for the network. Then, split the data into calibration and validation data sets.
unzip('MerchData.zip'); imds = imageDatastore('MerchData', ... 'IncludeSubfolders',true, ... 'LabelSource','foldernames'); [calData, valData] = splitEachLabel(imds, 0.7, 'randomized'); aug_calData = augmentedImageDatastore([227 227], calData); aug_valData = augmentedImageDatastore([227 227], valData);
Create a dlquantizer
object and specify the network to quantize.
quantObj = dlquantizer(net);
Define a metric function to use to compare the behavior of the network before and after quantization. This example uses the hComputeModelAccuracy
metric function.
function accuracy = hComputeModelAccuracy(predictionScores, net, dataStore) %% Computes model-level accuracy statistics % Load ground truth tmp = readall(dataStore); groundTruth = tmp.response; % Compare with predicted label with actual ground truth predictionError = {}; for idx=1:numel(groundTruth) [~, idy] = max(predictionScores(idx,:)); yActual = net.Layers(end).Classes(idy); predictionError{end+1} = (yActual == groundTruth(idx)); %#ok end % Sum all prediction errors. predictionError = [predictionError{:}]; accuracy = sum(predictionError)/numel(predictionError); end
Specify the metric function in a dlquantizationOptions
object.
quantOpts = dlquantizationOptions('MetricFcn',{@(x)hComputeModelAccuracy(x, net, aug_valData)});
Use the calibrate
function to exercise the network with sample inputs and collect range information. The calibrate
function exercises the network and collects the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network and the dynamic ranges of the activations in all layers of the network. The function returns a table. Each row of the table contains range information for a learnable parameter of the optimized network.
calResults = calibrate(quantObj, aug_calData)
calResults=121×5 table
Optimized Layer Name Network Layer Name Learnables / Activations MinValue MaxValue
____________________________ ____________________ ________________________ _________ ________
{'conv1_Weights' } {'conv1' } "Weights" -0.91985 0.88489
{'conv1_Bias' } {'conv1' } "Bias" -0.07925 0.26343
{'fire2-squeeze1x1_Weights'} {'fire2-squeeze1x1'} "Weights" -1.38 1.2477
{'fire2-squeeze1x1_Bias' } {'fire2-squeeze1x1'} "Bias" -0.11641 0.24273
{'fire2-expand1x1_Weights' } {'fire2-expand1x1' } "Weights" -0.7406 0.90982
{'fire2-expand1x1_Bias' } {'fire2-expand1x1' } "Bias" -0.060056 0.14602
{'fire2-expand3x3_Weights' } {'fire2-expand3x3' } "Weights" -0.74397 0.66905
{'fire2-expand3x3_Bias' } {'fire2-expand3x3' } "Bias" -0.051778 0.074239
{'fire3-squeeze1x1_Weights'} {'fire3-squeeze1x1'} "Weights" -0.7712 0.68917
{'fire3-squeeze1x1_Bias' } {'fire3-squeeze1x1'} "Bias" -0.10138 0.32675
{'fire3-expand1x1_Weights' } {'fire3-expand1x1' } "Weights" -0.72035 0.9743
{'fire3-expand1x1_Bias' } {'fire3-expand1x1' } "Bias" -0.067029 0.30425
{'fire3-expand3x3_Weights' } {'fire3-expand3x3' } "Weights" -0.61443 0.7741
{'fire3-expand3x3_Bias' } {'fire3-expand3x3' } "Bias" -0.053613 0.10329
{'fire4-squeeze1x1_Weights'} {'fire4-squeeze1x1'} "Weights" -0.7422 1.0877
{'fire4-squeeze1x1_Bias' } {'fire4-squeeze1x1'} "Bias" -0.10885 0.13881
⋮
Use the validate
function to quantize the learnable parameters in the convolution layers of the network and exercise the network. The function uses the metric function defined in the dlquantizationOptions
object to compare the results of the network before and after quantization.
valResults = validate(quantObj, aug_valData, quantOpts)
valResults = struct with fields:
NumSamples: 20
MetricResults: [1×1 struct]
Statistics: [2×2 table]
Examine the validation output to see the performance of the quantized network.
valResults.MetricResults.Result
ans=2×2 table
NetworkImplementation MetricOutput
_____________________ ____________
{'Floating-Point'} 1
{'Quantized' } 1
valResults.Statistics
ans=2×2 table
NetworkImplementation LearnableParameterMemory(bytes)
_____________________ _______________________________
{'Floating-Point'} 2.9003e+06
{'Quantized' } 7.3393e+05
In this example, the memory required for the network was reduced approximately 75% through quantization. The accuracy of the network is not affected.
The weights, biases, and activations of the convolution layers of the network specified in the dlquantizer object now use scaled 8-bit integer data types.
Quantize a Neural Network for FPGA Target
This example shows how to quantize learnable parameters in the
convolution layers of a neural network, and explore the behavior of the quantized
network. In this example, you quantize the LogoNet
neural network.
Quantization helps reduce the memory requirement of a deep neural network by quantizing
weights, biases and activations of network layers to 8-bit scaled integer data types.
Use MATLAB® to retrieve the prediction results from the target device.
To run this example, you need the products listed under FPGA
in
Quantization Workflow Prerequisites.
For additional requirements, see Quantization Workflow Prerequisites.
Create a file in your current working directory called
getLogoNetwork.m
. Enter these lines into the file:
function net = getLogoNetwork() data = getLogoData(); net = data.convnet; end function data = getLogoData() if ~isfile('LogoNet.mat') url = 'https://www.mathworks.com/supportfiles/gpucoder/cnn_models/logo_detection/LogoNet.mat'; websave('LogoNet.mat',url); end data = load('LogoNet.mat'); end
Load the pretrained network.
snet = getLogoNetwork();
snet = SeriesNetwork with properties: Layers: [22×1 nnet.cnn.layer.Layer] InputNames: {'imageinput'} OutputNames: {'classoutput'}
Define calibration and validation data to use for quantization.
The calibration data is used to collect the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network and the dynamic ranges of the activations in all layers of the network. For the best quantization results, the calibration data must be representative of inputs to the network.
The validation data is used to test the network after quantization to understand the effects of the limited range and precision of the quantized convolution layers in the network.
This example uses the images in the logos_dataset
data set.
Define an augmentedImageDatastore
object to resize the data for the
network. Then, split the data into calibration and validation data sets.
curDir = pwd; newDir = fullfile(matlabroot,'examples','deeplearning_shared','data','logos_dataset.zip'); copyfile(newDir,curDir); unzip('logos_dataset.zip'); imageData = imageDatastore(fullfile(curDir,'logos_dataset'),... 'IncludeSubfolders',true,'FileExtensions','.JPG','LabelSource','foldernames'); [calibrationData, validationData] = splitEachLabel(imageData, 0.5,'randomized');
Create a dlquantizer
object and specify the network to
quantize.
dlQuantObj = dlquantizer(snet,'ExecutionEnvironment','FPGA');
Use the calibrate
function to exercise the network with sample
inputs and collect range information. The calibrate
function
exercises the network and collects the dynamic ranges of the weights and biases in
the convolution and fully connected layers of the network and the dynamic ranges of
the activations in all layers of the network. The function returns a table. Each row
of the table contains range information for a learnable parameter of the optimized
network.
dlQuantObj.calibrate(calibrationData)
ans = Optimized Layer Name Network Layer Name Learnables / Activations MinValue MaxValue ____________________________ __________________ ________________________ ___________ __________ {'conv_1_Weights' } {'conv_1' } "Weights" -0.048978 0.039352 {'conv_1_Bias' } {'conv_1' } "Bias" 0.99996 1.0028 {'conv_2_Weights' } {'conv_2' } "Weights" -0.055518 0.061901 {'conv_2_Bias' } {'conv_2' } "Bias" -0.00061171 0.00227 {'conv_3_Weights' } {'conv_3' } "Weights" -0.045942 0.046927 {'conv_3_Bias' } {'conv_3' } "Bias" -0.0013998 0.0015218 {'conv_4_Weights' } {'conv_4' } "Weights" -0.045967 0.051 {'conv_4_Bias' } {'conv_4' } "Bias" -0.00164 0.0037892 {'fc_1_Weights' } {'fc_1' } "Weights" -0.051394 0.054344 {'fc_1_Bias' } {'fc_1' } "Bias" -0.00052319 0.00084454 {'fc_2_Weights' } {'fc_2' } "Weights" -0.05016 0.051557 {'fc_2_Bias' } {'fc_2' } "Bias" -0.0017564 0.0018502 {'fc_3_Weights' } {'fc_3' } "Weights" -0.050706 0.04678 {'fc_3_Bias' } {'fc_3' } "Bias" -0.02951 0.024855 {'imageinput' } {'imageinput'} "Activations" 0 255 {'imageinput_normalization'} {'imageinput'} "Activations" -139.34 198.72
Create a target object with a custom name for your target device and an interface to connect your target device to the host computer. Interface options are JTAG and Ethernet. To create the target object, enter:
hTarget = dlhdl.Target('Intel', 'Interface', 'JTAG');
Define a metric function to use to compare the behavior of the network before and after quantization. Save this function in a local file.
function accuracy = hComputeModelAccuracy(predictionScores, net, dataStore) %% hComputeModelAccuracy test helper function computes model level accuracy statistics % Copyright 2020 The MathWorks, Inc. % Load ground truth groundTruth = dataStore.Labels; % Compare predicted label with ground truth predictionError = {}; for idx=1:numel(groundTruth) [~, idy] = max(predictionScores(idx, :)); yActual = net.Layers(end).Classes(idy); predictionError{end+1} = (yActual == groundTruth(idx)); %#ok end % Sum all prediction errors. predictionError = [predictionError{:}]; accuracy = sum(predictionError)/numel(predictionError); end
Specify the metric function in a dlquantizationOptions
object.
options = dlquantizationOptions('MetricFcn', ... {@(x)hComputeModelAccuracy(x, snet, validationData)},'Bitstream','arria10soc_int8',... 'Target',hTarget);
To compile and deploy the quantized network, run the validate
function of the dlquantizer
object. Use the
validate
function to quantize the learnable parameters in the
convolution layers of the network and exercise the network. This function uses the
output of the compile function to program the FPGA board by using the programming
file. It also downloads the network weights and biases. The deploy function checks
for the Intel Quartus tool and the supported tool version. It then starts
programming the FPGA device by using the sof file, displays progress messages, and
the time it takes to deploy the network. The function uses the metric function
defined in the dlquantizationOptions
object to compare the results
of the network before and after quantization.
prediction = dlQuantObj.validate(validationData,options);
offset_name offset_address allocated_space _______________________ ______________ _________________ "InputDataOffset" "0x00000000" "48.0 MB" "OutputResultOffset" "0x03000000" "4.0 MB" "SystemBufferOffset" "0x03400000" "60.0 MB" "InstructionDataOffset" "0x07000000" "8.0 MB" "ConvWeightDataOffset" "0x07800000" "8.0 MB" "FCWeightDataOffset" "0x08000000" "12.0 MB" "EndOffset" "0x08c00000" "Total: 140.0 MB" ### Programming FPGA Bitstream using JTAG... ### Programming the FPGA bitstream has been completed successfully. ### Loading weights to Conv Processor. ### Conv Weights loaded. Current time is 16-Jul-2020 12:45:10 ### Loading weights to FC Processor. ### FC Weights loaded. Current time is 16-Jul-2020 12:45:26 ### Finished writing input activations. ### Running single input activations. Deep Learning Processor Profiler Performance Results LastLayerLatency(cycles) LastLayerLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 13570959 0.09047 30 380609145 11.8 conv_module 12667786 0.08445 conv_1 3938907 0.02626 maxpool_1 1544560 0.01030 conv_2 2910954 0.01941 maxpool_2 577524 0.00385 conv_3 2552707 0.01702 maxpool_3 676542 0.00451 conv_4 455434 0.00304 maxpool_4 11251 0.00008 fc_module 903173 0.00602 fc_1 536164 0.00357 fc_2 342643 0.00228 fc_3 24364 0.00016 * The clock frequency of the DL processor is: 150MHz ### Finished writing input activations. ### Running single input activations. Deep Learning Processor Profiler Performance Results LastLayerLatency(cycles) LastLayerLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 13570364 0.09047 30 380612682 11.8 conv_module 12667103 0.08445 conv_1 3939296 0.02626 maxpool_1 1544371 0.01030 conv_2 2910747 0.01940 maxpool_2 577654 0.00385 conv_3 2551829 0.01701 maxpool_3 676548 0.00451 conv_4 455396 0.00304 maxpool_4 11355 0.00008 fc_module 903261 0.00602 fc_1 536206 0.00357 fc_2 342688 0.00228 fc_3 24365 0.00016 * The clock frequency of the DL processor is: 150MHz ### Finished writing input activations. ### Running single input activations. Deep Learning Processor Profiler Performance Results LastLayerLatency(cycles) LastLayerLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 13571561 0.09048 30 380608338 11.8 conv_module 12668340 0.08446 conv_1 3939070 0.02626 maxpool_1 1545327 0.01030 conv_2 2911061 0.01941 maxpool_2 577557 0.00385 conv_3 2552082 0.01701 maxpool_3 676506 0.00451 conv_4 455582 0.00304 maxpool_4 11248 0.00007 fc_module 903221 0.00602 fc_1 536167 0.00357 fc_2 342643 0.00228 fc_3 24409 0.00016 * The clock frequency of the DL processor is: 150MHz ### Finished writing input activations. ### Running single input activations. Deep Learning Processor Profiler Performance Results LastLayerLatency(cycles) LastLayerLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 13569862 0.09047 30 380613327 11.8 conv_module 12666756 0.08445 conv_1 3939212 0.02626 maxpool_1 1543267 0.01029 conv_2 2911184 0.01941 maxpool_2 577275 0.00385 conv_3 2552868 0.01702 maxpool_3 676438 0.00451 conv_4 455353 0.00304 maxpool_4 11252 0.00008 fc_module 903106 0.00602 fc_1 536050 0.00357 fc_2 342645 0.00228 fc_3 24409 0.00016 * The clock frequency of the DL processor is: 150MHz ### Finished writing input activations. ### Running single input activations. Deep Learning Processor Profiler Performance Results LastLayerLatency(cycles) LastLayerLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 13570823 0.09047 30 380619836 11.8 conv_module 12667607 0.08445 conv_1 3939074 0.02626 maxpool_1 1544519 0.01030 conv_2 2910636 0.01940 maxpool_2 577769 0.00385 conv_3 2551800 0.01701 maxpool_3 676795 0.00451 conv_4 455859 0.00304 maxpool_4 11248 0.00007 fc_module 903216 0.00602 fc_1 536165 0.00357 fc_2 342643 0.00228 fc_3 24406 0.00016 * The clock frequency of the DL processor is: 150MHz offset_name offset_address allocated_space _______________________ ______________ _________________ "InputDataOffset" "0x00000000" "48.0 MB" "OutputResultOffset" "0x03000000" "4.0 MB" "SystemBufferOffset" "0x03400000" "60.0 MB" "InstructionDataOffset" "0x07000000" "8.0 MB" "ConvWeightDataOffset" "0x07800000" "8.0 MB" "FCWeightDataOffset" "0x08000000" "12.0 MB" "EndOffset" "0x08c00000" "Total: 140.0 MB" ### FPGA bitstream programming has been skipped as the same bitstream is already loaded on the target FPGA. ### Deep learning network programming has been skipped as the same network is already loaded on the target FPGA. ### Finished writing input activations. ### Running single input activations. Deep Learning Processor Profiler Performance Results LastLayerLatency(cycles) LastLayerLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 13572329 0.09048 10 127265075 11.8 conv_module 12669135 0.08446 conv_1 3939559 0.02626 maxpool_1 1545378 0.01030 conv_2 2911243 0.01941 maxpool_2 577422 0.00385 conv_3 2552064 0.01701 maxpool_3 676678 0.00451 conv_4 455657 0.00304 maxpool_4 11227 0.00007 fc_module 903194 0.00602 fc_1 536140 0.00357 fc_2 342688 0.00228 fc_3 24364 0.00016 * The clock frequency of the DL processor is: 150MHz ### Finished writing input activations. ### Running single input activations. Deep Learning Processor Profiler Performance Results LastLayerLatency(cycles) LastLayerLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 13572527 0.09048 10 127266427 11.8 conv_module 12669266 0.08446 conv_1 3939776 0.02627 maxpool_1 1545632 0.01030 conv_2 2911169 0.01941 maxpool_2 577592 0.00385 conv_3 2551613 0.01701 maxpool_3 676811 0.00451 conv_4 455418 0.00304 maxpool_4 11348 0.00008 fc_module 903261 0.00602 fc_1 536205 0.00357 fc_2 342689 0.00228 fc_3 24365 0.00016 * The clock frequency of the DL processor is: 150MHz
Examine the MetricResults.Result
field of the validation output
to see the performance of the quantized network.
validateOut = prediction.MetricResults.Result
ans = NetworkImplementation MetricOutput _____________________ ____________ {'Floating-Point'} 0.9875 {'Quantized' } 0.9875
Examine the QuantizedNetworkFPS
field of the validation output
to see the frames per second performance of the quantized network.
prediction.QuantizedNetworkFPS
ans = 11.8126
The weights, biases, and activations of the convolution layers of the network
specified in the dlquantizer
object now use scaled 8-bit integer
data types.
Validate Quantized Network by Using MATLAB Simulation
This example shows how to quantize learnable parameters in the convolution layers of a neural network, and validate the quantized network. Rapidly prototype the quantized network by using MATLAB based simulation to validate the quantized network. For this type of simulation, you do not need hardware FPGA board from the prototyping process. In this example, you quantize the LogoNet neural network.
For this example, you need the products listed under FPGA
in Quantization Workflow Prerequisites.
Load the pretrained network and analyze the network architecture.
snet = getLogoNetwork; analyzeNetwork(snet);

Define calibration and validation data to use for quantization.
This example uses the logos_dataset
data set. The data set
consists of 320 images. Each image is 227-by-227 in size and has three color channels
(RGB). Create an augmentedImageDatastore
object to use for
calibration and validation. Expedite the calibration and validation process by reducing
the calibration data set to 20 images. The MATLAB simulation workflow has a maximum
limit of five images when validating the quantized network. Reduce the validation data
set sizes to five images.
curDir = pwd; newDir = fullfile(matlabroot,'examples','deeplearning_shared','data','logos_dataset.zip'); copyfile(newDir,curDir,'f'); unzip('logos_dataset.zip'); imageData = imageDatastore(fullfile(curDir,'logos_dataset'),... 'IncludeSubfolders',true,'FileExtensions','.JPG','LabelSource','foldernames'); [calibrationData, validationData] = splitEachLabel(imageData, 0.5,'randomized'); calibrationData_reduced = calibrationData.subset(1:20); validationData_reduced = validationData.subset(1:5);
Create a quantized network by using the dlquantizer
object. To use
the MATLAB simulation environment set Simulation to on.
dlQuantObj = dlquantizer(snet,'ExecutionEnvironment','FPGA','Simulation','on')
Use the calibrate
function to exercise the network with sample
inputs and collect the range information. The calibrate
function
exercises the network and collects the dynamic ranges of the weights and biases in the
convolution and fully connected layers of the network and the dynamic ranges of the
activations in all layers of the network. The calibrate function returns a table. Each
row of the table contains range information for a learnable parameter of the quantized
network.
dlQuantObj.calibrate(calibrationData_reduced)
ans = 35×5 table Optimized Layer Name Network Layer Name Learnables / Activations MinValue MaxValue ____________________________ __________________ ________________________ ___________ __________ {'conv_1_Weights' } {'conv_1' } "Weights" -0.048978 0.039352 {'conv_1_Bias' } {'conv_1' } "Bias" 0.99996 1.0028 {'conv_2_Weights' } {'conv_2' } "Weights" -0.055518 0.061901 {'conv_2_Bias' } {'conv_2' } "Bias" -0.00061171 0.00227 {'conv_3_Weights' } {'conv_3' } "Weights" -0.045942 0.046927 {'conv_3_Bias' } {'conv_3' } "Bias" -0.0013998 0.0015218 {'conv_4_Weights' } {'conv_4' } "Weights" -0.045967 0.051 {'conv_4_Bias' } {'conv_4' } "Bias" -0.00164 0.0037892 {'fc_1_Weights' } {'fc_1' } "Weights" -0.051394 0.054344 {'fc_1_Bias' } {'fc_1' } "Bias" -0.00052319 0.00084454 {'fc_2_Weights' } {'fc_2' } "Weights" -0.05016 0.051557 {'fc_2_Bias' } {'fc_2' } "Bias" -0.0017564 0.0018502 {'fc_3_Weights' } {'fc_3' } "Weights" -0.050706 0.04678 {'fc_3_Bias' } {'fc_3' } "Bias" -0.02951 0.024855 {'imageinput' } {'imageinput' } "Activations" 0 255 {'imageinput_normalization'} {'imageinput' } "Activations" -139.34 198.11 {'conv_1' } {'conv_1' } "Activations" -431.01 290.14 {'relu_1' } {'relu_1' } "Activations" 0 290.14 {'maxpool_1' } {'maxpool_1' } "Activations" 0 290.14 {'conv_2' } {'conv_2' } "Activations" -166.41 466.4 {'relu_2' } {'relu_2' } "Activations" 0 466.4 {'maxpool_2' } {'maxpool_2' } "Activations" 0 466.4 {'conv_3' } {'conv_3' } "Activations" -219.6 300.65 {'relu_3' } {'relu_3' } "Activations" 0 300.65 {'maxpool_3' } {'maxpool_3' } "Activations" 0 299.73 {'conv_4' } {'conv_4' } "Activations" -245.37 209.11 {'relu_4' } {'relu_4' } "Activations" 0 209.11 {'maxpool_4' } {'maxpool_4' } "Activations" 0 209.11 {'fc_1' } {'fc_1' } "Activations" -123.79 77.114 {'relu_5' } {'relu_5' } "Activations" 0 77.114 {'fc_2' } {'fc_2' } "Activations" -16.557 17.512 {'relu_6' } {'relu_6' } "Activations" 0 17.512 {'fc_3' } {'fc_3' } "Activations" -13.049 37.204 {'softmax' } {'softmax' } "Activations" 1.4971e-22 1 {'classoutput' } {'classoutput'} "Activations" 1.4971e-22 1
Set your target metric function and create a
dlquantizationOptions
object with the target metric function and
the validation data set. In this example the target metric function calculates the Top-5
accuracy.
options = dlquantizationOptions('MetricFcn', {@(x)hComputeAccuracy(x,snet,validationData_reduced)});
Note
If no custom metric function is specified, the default metric function will be used for validation. The default metric function uses at most 5 files from the validation datastore when the MATLAB® simulation environment is selected. Custom metric functions do not have this restriction.
Use the validate
function to quantize the learnable parameters
in the convolution layers of the network. The validate
function
simulates the quantized network in MATLAB. The validate
function
uses the metric function defined in the dlquantizationOptions
object to compare the results of the single data type network object to the results of
the quantized network object.
prediction = dlQuantObj.validate(validationData_reduced,options)
### Notice: (Layer 1) The layer 'imageinput' with type 'nnet.cnn.layer.ImageInputLayer' is implemented in software. ### Notice: (Layer 2) The layer 'out_imageinput' with type 'nnet.cnn.layer.RegressionOutputLayer' is implemented in software. Compiling leg: conv_1>>maxpool_4 ... ### Notice: (Layer 1) The layer 'imageinput' with type 'nnet.cnn.layer.ImageInputLayer' is implemented in software. ### Notice: (Layer 14) The layer 'output' with type 'nnet.cnn.layer.RegressionOutputLayer' is implemented in software. Compiling leg: conv_1>>maxpool_4 ... complete. Compiling leg: fc_1>>fc_3 ... ### Notice: (Layer 1) The layer 'maxpool_4' with type 'nnet.cnn.layer.ImageInputLayer' is implemented in software. ### Notice: (Layer 7) The layer 'output' with type 'nnet.cnn.layer.RegressionOutputLayer' is implemented in software. Compiling leg: fc_1>>fc_3 ... complete. ### Should not enter here. It means a component is unaccounted for in MATLAB Emulation. ### Notice: (Layer 1) The layer 'fc_3' with type 'nnet.cnn.layer.ImageInputLayer' is implemented in software. ### Notice: (Layer 2) The layer 'softmax' with type 'nnet.cnn.layer.SoftmaxLayer' is implemented in software. ### Notice: (Layer 3) The layer 'classoutput' with type 'nnet.cnn.layer.ClassificationOutputLayer' is implemented in software. prediction = struct with fields: NumSamples: 5 MetricResults: [1×1 struct]
Examine the MetricResults.Result
field of the validation output
to see the performance of the quantized network.
validateOut = prediction.MetricResults.Result
validateOut = 2×2 table NetworkImplementation MetricOutput _____________________ ____________ {'Floating-Point'} 1 {'Quantized' } 1
Validate Quantized Neural Network for CPU Target
This example uses:
- MATLAB CoderMATLAB Coder
- Deep Learning ToolboxDeep Learning Toolbox
- MATLAB Support Package for Raspberry Pi HardwareMATLAB Support Package for Raspberry Pi Hardware
- Embedded CoderEmbedded Coder
- MATLAB Coder Interface for Deep Learning LibrariesMATLAB Coder Interface for Deep Learning Libraries
This example shows how to quantize and validate a neural network for a CPU target. This workflow is similar to other execution environments, but before validating you must establish a raspi
connection.
First, load your network. This example uses the pretrained network squeezeNet
.
load squeezenetmerch
net
net = DAGNetwork with properties: Layers: [68×1 nnet.cnn.layer.Layer] Connections: [75×2 table] InputNames: {'data'} OutputNames: {'new_classoutput'}
Then define your calibration and validation data, calDS
and valDS
respectively.
unzip('MerchData.zip'); imds = imageDatastore('MerchData', ... 'IncludeSubfolders',true, ... 'LabelSource','foldernames'); [calData, valData] = splitEachLabel(imds, 0.7, 'randomized'); aug_calData = augmentedImageDatastore([227 227], calData); aug_valData = augmentedImageDatastore([227 227], valData);
Create the dlquantizer object.
dq = dlquantizer(net,'ExecutionEnvironment','CPU')
dq = dlquantizer with properties: NetworkObject: [1×1 DAGNetwork] ExecutionEnvironment: 'CPU'
Calibrate the network.
calResults = calibrate(dq, aug_calData)
calResults=121×5 table
Optimized Layer Name Network Layer Name Learnables / Activations MinValue MaxValue
____________________________ ____________________ ________________________ _________ ________
{'conv1_Weights' } {'conv1' } "Weights" -0.91985 0.88489
{'conv1_Bias' } {'conv1' } "Bias" -0.07925 0.26343
{'fire2-squeeze1x1_Weights'} {'fire2-squeeze1x1'} "Weights" -1.38 1.2477
{'fire2-squeeze1x1_Bias' } {'fire2-squeeze1x1'} "Bias" -0.11641 0.24273
{'fire2-expand1x1_Weights' } {'fire2-expand1x1' } "Weights" -0.7406 0.90982
{'fire2-expand1x1_Bias' } {'fire2-expand1x1' } "Bias" -0.060056 0.14602
{'fire2-expand3x3_Weights' } {'fire2-expand3x3' } "Weights" -0.74397 0.66905
{'fire2-expand3x3_Bias' } {'fire2-expand3x3' } "Bias" -0.051778 0.074239
{'fire3-squeeze1x1_Weights'} {'fire3-squeeze1x1'} "Weights" -0.7712 0.68917
{'fire3-squeeze1x1_Bias' } {'fire3-squeeze1x1'} "Bias" -0.10138 0.32675
{'fire3-expand1x1_Weights' } {'fire3-expand1x1' } "Weights" -0.72035 0.9743
{'fire3-expand1x1_Bias' } {'fire3-expand1x1' } "Bias" -0.067029 0.30425
{'fire3-expand3x3_Weights' } {'fire3-expand3x3' } "Weights" -0.61443 0.7741
{'fire3-expand3x3_Bias' } {'fire3-expand3x3' } "Bias" -0.053613 0.10329
{'fire4-squeeze1x1_Weights'} {'fire4-squeeze1x1'} "Weights" -0.7422 1.0877
{'fire4-squeeze1x1_Bias' } {'fire4-squeeze1x1'} "Bias" -0.10885 0.13881
⋮
Establish raspi
connection. This step is unique to CPU targets.
If you are connecting to your Raspberry Pi hardware board for the first time, you must also specify the Username
, DeviceAddress
, and Password
.
r = raspi;
% r = raspi('raspiname','username','password');
Use the MATLAB Support Package for Raspberry Pi function, raspi
, to create a connection to the Raspberry Pi. In the above code, replace:
raspiname
with the name or address of your Raspberry Piusername
with your user namepassword
with your password
For more information about raspi
, see raspi.
Validate the quantized network with the validate
function.
valResults = validate(dq, aug_valData)
### Starting application: 'codegen\lib\validate_predict_int8\pil\validate_predict_int8.elf' To terminate execution: clear validate_predict_int8_pil ### Launching application validate_predict_int8.elf... ### Host application produced the following standard output (stdout) and standard error (stderr) messages:
valResults = struct with fields:
NumSamples: 20
MetricResults: [1×1 struct]
Statistics: []
Examine the validation output to see the performance of the quantized network.
valResults.MetricResults.Result
ans=2×2 table
NetworkImplementation MetricOutput
_____________________ ____________
{'Floating-Point'} 1
{'Quantized' } 1
Input Arguments
quantObj
— Network to quantize
dlquantizer
object
dlquantizer
object specifying the network to quantize.
valData
— Data to use for validation of quantized network
imageDataStore
object | augmentedImageDataStore
object | pixelLabelImageDataStore
object
Data to use for validation of quantized network, specified as an
imageDataStore
object, an augmentedImageDataStore
object, or a pixelLabelImageDataStore
object.
quantOpts
— Options for quantizing network
dlQuantizationOptions
object
Options for quantizing the network, specified as a dlquantizationOptions
object.
Output Arguments
validationResults
— Results of quantization of network
struct
Results of quantization of the network, returned as a struct. The struct contains these fields.
NumSamples
– The number of sample inputs used to validate the network.MetricResults
– Struct containing results of the metric function defined in thedlquantizationOptions
object. When more than one metric function is specified in thedlquantizationOptions
object,MetricResults
is an array of structs.MetricResults
contains these fields.
Field | Description |
---|---|
MetricFunction | Function used to determine the performance of the quantized network. This
function is specified in the dlquantizationOptions object. |
Result | Table indicating the results of the metric function before and after quantization. The first row in the table contains the
information for the original, floating-point implementation. The second row
contains the information for the quantized implementation. The output of the
metric function is displayed in the |
Version History
See Also
Apps
Functions
Abrir ejemplo
Tiene una versión modificada de este ejemplo. ¿Desea abrir este ejemplo con sus modificaciones?
Comando de MATLAB
Ha hecho clic en un enlace que corresponde a este comando de MATLAB:
Ejecute el comando introduciéndolo en la ventana de comandos de MATLAB. Los navegadores web no admiten comandos de MATLAB.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list:
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)