What's the default quantization mode for TensorRT PTQ

401616764 · July 31, 2021, 5:43am

According to the TensorRT documentation, TensorRT only supports symmetric, uniform quantization, which means the quantization zero-point should always be 0.
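For reference, symmetric uniform quantization can be sketched as below. This is a minimal illustration and not TensorRT's implementation: the function names are hypothetical, and the int8 clamp range of [-128, 127] with a scale of `amax / 127` is an assumption made for the example.

```python
# Minimal sketch of symmetric uniform int8 quantization (zero-point fixed at 0).
# Function names and the clamp range are assumptions for illustration only.

def symmetric_scale(amax, qmax=127):
    """Scale that maps the real range [-amax, amax] onto [-qmax, qmax]."""
    return amax / qmax

def quantize(x, scale, qmin=-128, qmax=127):
    """Divide by the scale, round to nearest (ties to even), then clamp."""
    q = round(x / scale)
    return max(qmin, min(qmax, q))

def dequantize(q, scale):
    """With a zero-point of 0, dequantization is a plain multiplication."""
    return q * scale

scale = symmetric_scale(5.6845)
print(quantize(0.0, scale), quantize(5.6845, scale), quantize(-100.0, scale))
```

Because there is no zero-point term, the real value 0.0 always maps to the integer 0.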

But when I manually set the dynamic range (e.g. (0, 5.6845)) for network layers, the verbose logs show TensorRT calculating a scale and a non-zero zero-point. So does TensorRT support asymmetric uniform quantization, in conflict with the documentation?
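To make the distinction concrete, the sketch below computes quantization parameters for the range (0, 5.6845) under both schemes. It is an illustration only: the function names are hypothetical, the int8 range [-128, 127] is assumed, and taking the symmetric scale from `max(abs(min), abs(max))` is an assumption about how a one-sided range would be handled, not something confirmed in this thread.

```python
# Compare symmetric and asymmetric (affine) int8 parameters for one range.
# All names here are hypothetical; this does not reproduce TensorRT's logs.

def symmetric_params(rmin, rmax, qmax=127):
    """Symmetric scheme: scale from the largest magnitude, zero-point is 0."""
    amax = max(abs(rmin), abs(rmax))
    return amax / qmax, 0

def affine_params(rmin, rmax, qmin=-128, qmax=127):
    """Asymmetric scheme: map [rmin, rmax] onto the full [qmin, qmax] range."""
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    return scale, zero_point

print(symmetric_params(0.0, 5.6845))  # zero-point is 0
print(affine_params(0.0, 5.6845))     # zero-point is non-zero
```

For a non-negative range like this one, the symmetric scheme leaves the negative half of the int8 range unused, while the affine scheme uses all 256 levels at the cost of a non-zero zero-point.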

And are the weights quantized per channel by default in PTQ? Can the user configure it to be per tensor?
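For readers unfamiliar with the two granularities, here is a small sketch of how per-tensor and per-channel weight scales differ. It assumes a weight matrix laid out as one row per output channel and a symmetric int8 scale of `amax / 127`; the function names are hypothetical and nothing here shows which mode TensorRT picks.

```python
# Per-tensor vs per-channel symmetric weight scales (illustrative only).
# Assumes weights[i] holds the weights of output channel i.

def per_tensor_scale(weights, qmax=127):
    """One scale for the whole tensor, from its largest absolute value."""
    amax = max(abs(w) for row in weights for w in row)
    return amax / qmax

def per_channel_scales(weights, qmax=127):
    """One scale per output channel, each from that channel's own maximum."""
    return [max(abs(w) for w in row) / qmax for row in weights]

weights = [[0.1, -0.2], [2.0, -4.0]]
print(per_tensor_scale(weights))
print(per_channel_scales(weights))
```

With per-tensor scaling, the small-magnitude first channel shares the scale set by the large second channel, so it is quantized more coarsely than it would be with its own scale.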

spolisetty · August 5, 2021, 10:12am

@401616764,

We only support symmetric quantization. In PTQ, TensorRT sets the weights' scales itself, so the user cannot control weight quantization.

Thank you.
