Data type (cuDNN)
Inference computation precision
Description
App Configuration Pane: Deep Learning
Configuration Objects: coder.CuDNNConfig
Specify the precision of the inference computations (32-bit floating point versus 8-bit integer) in supported layers. int8 precision requires a CUDA® GPU with a minimum compute capability of 6.1. Compute capability 6.2 does not support int8 precision. Use the ComputeCapability property of the GpuConfig object to set the appropriate compute capability value.
Note
When performing inference in int8 precision using cuDNN version 8.1.0, issues in the NVIDIA® library may cause significant degradation in performance.
Dependencies
To enable this parameter, you must set Deep learning library to cuDNN.
Settings
fp32
Default. Inference computation is performed in 32-bit floating point.
int8
Inference computation is performed in 8-bit integers.
Programmatic Use
Property: DataType
Values: 'fp32' | 'int8'
Default: 'fp32'
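
For example, a minimal configuration sketch that selects int8 inference with the cuDNN library might look like the following. The output type ('mex'), network function name, and compute capability value are illustrative assumptions, not requirements of this parameter:

```matlab
% Create a GPU code generation configuration object (assumed 'mex' output).
cfg = coder.gpuConfig('mex');

% int8 requires compute capability 6.1 or higher (6.2 is not supported).
cfg.GpuConfig.ComputeCapability = '6.1';

% Select the cuDNN deep learning library and set the inference precision.
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.DeepLearningConfig.DataType = 'int8';
```

Pass this configuration object to codegen with the -config option when generating code for your network entry-point function.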
Version History
Introduced in R2020a