Building int8 model: combining direct dynamic range setting with calibration

Hi,

I’m building a TensorRT engine for int8 precision and have to use implicit quantization because I’m targeting the DLA. Is it possible to combine directly setting the dynamic range (via tensor.set_dynamic_range()) with calibration? For some tensors I know the dynamic range and can set a tighter range than calibration would produce, but for the rest of the tensors I want the dynamic range to come from the calibration data. In particular, if I directly set the dynamic range of some tensors using tensor.set_dynamic_range(), will that value be wiped out if I subsequently run calibration?

Thanks!

To answer my own question: yes, you CAN combine direct dynamic-range setting with calibration. If you set the dynamic range manually, it will NOT be overwritten by calibration. I see output like:

[11/19/2024-12:41:44] [TRT] [V] User overriding scale with scale and zero-point Quantization(scale: {0.0472441,}, zero-point: {0,})
[11/19/2024-12:41:44] [TRT] [V] User overriding scale with scale and zero-point Quantization(scale: {0.0472441,}, zero-point: {0,})
[11/19/2024-12:41:44] [TRT] [V] INT8 Inference Tensor scales and zero-points: /network/backbone/maxpool/MaxPool_output_0 scale and zero-point Quantization(scale: {0.0472556,}, zero-point: {0,})
[11/19/2024-12:41:44] [TRT] [V] User overriding scale with scale and zero-point Quantization(scale: {0.0472441,}, zero-point: {0,})
[11/19/2024-12:41:44] [TRT] [V] User overriding scale with scale and zero-point Quantization(scale: {0.0472441,}, zero-point: {0,})

So it seems to respect the user-supplied dynamic ranges ("User overriding scale"), while tensors that were left unset (like the MaxPool output above) get their scales from calibration.
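For anyone trying the same thing, here is a minimal sketch of how the two can be combined with the TensorRT Python API. Assumptions are flagged: `network` and `calibrator` come from your own build script, the tensor name and the ±6.0 range in `KNOWN_RANGES` are placeholders (±6.0 is only a guess that would reproduce the 0.0472441 scale in the log, since TensorRT derives the int8 scale as max_abs_range / 127), and the DLA settings mirror what I use but may need adjusting for your setup.

```python
def int8_scale(amax: float) -> float:
    """Scale TensorRT stores for a symmetric int8 dynamic range of [-amax, amax]."""
    return amax / 127.0


# Placeholder: tensor name -> known |max| of its dynamic range.
KNOWN_RANGES = {
    "/network/backbone/relu/Relu_output_0": 6.0,  # hypothetical tensor and range
}


def build_int8_engine(builder, network, calibrator):
    """Sketch: manual ranges for known tensors, calibration for the rest."""
    import tensorrt as trt  # assumes TensorRT's Python bindings are installed

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.INT8)
    config.int8_calibrator = calibrator  # covers tensors without a manual range

    # DLA targeting (adjust core index / fallback policy as needed).
    config.default_device_type = trt.DeviceType.DLA
    config.DLA_core = 0
    config.set_flag(trt.BuilderFlag.GPU_FALLBACK)

    # Override the tensors whose ranges we know; per the log output above,
    # calibration leaves these values alone.
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        for j in range(layer.num_outputs):
            t = layer.get_output(j)
            if t.name in KNOWN_RANGES:
                amax = KNOWN_RANGES[t.name]
                t.set_dynamic_range(-amax, amax)

    return builder.build_serialized_network(network, config)
```

A range of ±6.0 gives `int8_scale(6.0)` ≈ 0.0472441, which matches the "User overriding" lines in the log, while the calibrated MaxPool tensor lands at the slightly different 0.0472556.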