Hi, I have been working on INT8 calibration in TensorRT. I understand the quantization and the entropy (KL divergence) minimization that is done to quantize the weights. However, there is no clear explanation of the underlying algorithm used for this class. Can I get the reference that this class implements, or any reference that explains it?
Thank you.