Hi, I have been working on INT8 calibration in TensorRT. I understand the quantization and the entropy (KL divergence) minimization that is used to quantize the weights, but there is no clear explanation of the underlying algorithm used for this class. Is there a reference for what this class implements, or any reference that explains it?

Here is a useful tutorial for INT8 on TensorRT:

You can also check this source for calibrator information:
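
For a rough idea of what entropy calibration is doing, here is a minimal pure-Python sketch of the histogram-plus-KL-divergence threshold search as it is commonly described. It is not taken from TensorRT's source, and the function names (`kl_divergence`, `quantize_histogram`, `entropy_threshold`), the bin counts, and the epsilon clamp used as smoothing are all assumptions made for illustration. The idea: histogram the absolute values of a tensor, try each candidate clipping point, fold the clipped tail into the last kept bin, squeeze the kept bins into a small number of quantization levels, and keep the clipping point where the quantized distribution stays closest to the original.

```python
import math
import random


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) for two unnormalised histograms of equal length.

    Q is clamped to eps after normalisation so that bins with p > 0 and
    q == 0 give a large finite penalty (a crude smoothing assumption).
    """
    sp, sq = sum(p), sum(q)
    if sp == 0 or sq == 0:
        return math.inf
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            pn = pi / sp
            qn = max(qi / sq, eps)
            total += pn * math.log(pn / qn)
    return total


def quantize_histogram(counts, support, num_levels):
    """Merge the bins into num_levels groups, then expand back to the
    original length, spreading each group's mass evenly over the bins
    flagged in `support`."""
    n = len(counts)
    q = [0.0] * n
    for level in range(num_levels):
        start = level * n // num_levels
        stop = (level + 1) * n // num_levels
        active = sum(1 for j in range(start, stop) if support[j])
        if active == 0:
            continue
        share = sum(counts[start:stop]) / active
        for j in range(start, stop):
            if support[j]:
                q[j] = share
    return q


def entropy_threshold(values, num_bins=2048, num_levels=128):
    """Return the clipping threshold that minimises KL(P || Q)."""
    abs_values = [abs(v) for v in values]
    max_val = max(abs_values)
    if max_val <= 0:
        raise ValueError("need at least one non-zero value")
    bin_width = max_val / num_bins
    hist = [0] * num_bins
    for v in abs_values:
        hist[min(int(v / bin_width), num_bins - 1)] += 1

    best_i, best_kl = num_bins, math.inf
    for i in range(num_levels, num_bins + 1):
        # Reference distribution: kept bins, with the clipped tail
        # folded into the last kept bin.
        p = [float(c) for c in hist[:i]]
        p[-1] += sum(hist[i:])
        # Candidate distribution: kept bins (without the tail)
        # quantised to num_levels and expanded back to i bins.
        support = [c > 0 for c in p]
        q = quantize_histogram(hist[:i], support, num_levels)
        kl = kl_divergence(p, q)
        if kl < best_kl:
            best_i, best_kl = i, kl
    # Right edge of the last kept bin.
    return best_i * bin_width


# Example: a Gaussian bulk plus one large outlier.
random.seed(0)
samples = [abs(random.gauss(0.0, 1.0)) for _ in range(10_000)] + [100.0]
threshold = entropy_threshold(samples, num_bins=256, num_levels=32)
scale = threshold / 127.0  # symmetric INT8 scale for this threshold
print(threshold, scale)
```

The point of the search is that a single outlier should not dictate the scale: clipping it costs little in KL terms, while stretching the range to include it wastes most of the INT8 levels on empty bins.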

Is there a way to save the quantized weights calibrated by the Int8EntropyCalibrator?