TensorRT maps [-maxRange, maxRange] to [-127, 127] during quantization, so the dynamic range is symmetric. But if the original FP32 values are not centered on 0, how is the original distribution mapped to the quantized distribution?
Does it use a shift or some other method?
TRT maps [-maxRange, maxRange] to [-127, 127], so the range is symmetric.
Shift weights are not allowed in the quantizing and dequantizing scale layers, because only symmetric quantization is supported.
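To make the "scale only, no shift" point concrete, here is a minimal sketch of symmetric quantization in plain Python. It is not TensorRT code: the function names are hypothetical, and taking maxRange as the largest absolute value is an assumption for illustration only (the calibrator normally chooses the range itself).

```python
def symmetric_quantize(values, max_range):
    """Map floats in [-max_range, max_range] to INT8 codes in [-127, 127]."""
    scale = max_range / 127.0  # size of one INT8 step in FP32 units
    codes = []
    for v in values:
        q = round(v / scale)        # scale only; no shift (zero point) is applied
        q = max(-127, min(127, q))  # saturate anything outside the range
        codes.append(q)
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate FP32 values from INT8 codes."""
    return [q * scale for q in codes]


# A tensor whose values are centered around 3.0 rather than 0.
values = [1.0, 2.0, 3.0, 4.0, 5.0]
max_range = max(abs(v) for v in values)  # assumption: use the absolute maximum
codes, scale = symmetric_quantize(values, max_range)
restored = dequantize(codes, scale)
print(codes)
```

Because the range stays centered on 0, every code in this example is positive and the negative half of the INT8 range goes unused.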
You can also set a per-tensor dynamic range yourself; please refer to the link below:
(1) You mean that even if the distribution of FP32 values is asymmetric (Figure 1), the range should be symmetric, like -T1 and T1?
Furthermore, if the distribution of FP32 values is symmetric but not centered on 0 (Figure 2), the range should still be symmetric, like -T2 and T2?
If the answer to both cases is yes, won't they lose significant accuracy after calibration?
(2) When I ran the sample_int8 mnist sample, a CalibrationTable file was generated:
(Unnamed Layer* 4) [Fully Connected]_output: 3dc6a9ed
Is the 1st field the layer name? Were all layers actually processed (calibrated), or only the layers listed above? If the layers were selected, were they selected manually (by configuration) or automatically by TRT?
- Yes, the calibration is symmetric. The dynamic range usually does not extend all the way to the tensor's maximum value, because that would result in poor accuracy.
- Only layers that perform better at INT8 precision are converted, and the 1st value is the layer name.
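For anyone wondering about the second field of a table entry: as far as I understand, it is the hexadecimal bit pattern of a 32-bit float holding the per-tensor scale, with maxRange being roughly scale * 127. Treat both of those as assumptions rather than a documented contract. A small sketch for decoding the entry quoted above (the helper name is made up for illustration):

```python
import struct


def parse_calibration_entry(line):
    """Split 'name: hex' and decode the hex as a big-endian IEEE 754 float32."""
    name, hex_value = line.rsplit(":", 1)
    scale = struct.unpack(">f", bytes.fromhex(hex_value.strip()))[0]
    return name.strip(), scale


entry = "(Unnamed Layer* 4) [Fully Connected]_output: 3dc6a9ed"
name, scale = parse_calibration_entry(entry)
max_range = scale * 127.0  # assumed relation between scale and dynamic range
print(name, scale, max_range)
```

Under those assumptions, 3dc6a9ed decodes to a scale of about 0.097, which would correspond to a symmetric range of roughly [-12.3, 12.3] for that output.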
If the data is all positive or all negative, we lose half of the dynamic range, but that’s usually not a big problem: we actually lose only 1 bit (e.g. 0 to 127) out of 8 bits (INT8, -127 to 127).
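The "1 bit" figure can be checked with a quick count of the available INT8 codes. This is just arithmetic, not anything TensorRT-specific:

```python
import math

full_levels = len(range(-127, 128))   # codes -127..127 -> 255 levels
positive_levels = len(range(0, 128))  # codes 0..127    -> 128 levels

full_bits = math.log2(full_levels)          # just under 8 bits
positive_bits = math.log2(positive_levels)  # exactly 7 bits
lost_bits = full_bits - positive_bits

print(full_levels, positive_levels, lost_bits)
```

So all-positive (or all-negative) data still gets 128 distinct levels, which is about one bit less resolution than the full symmetric range.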