Hi there. I am trying to reduce precision from FP32 to FP16, and that part is quite straightforward, but I can only find limited resources on how to run INT8 inference (configuring the builder config and setting up the calibrator) in Python. Is there a good reference on how to do this for TensorRT 8.0.1?
Since INT8 changes the data type from floating point to integer, an extra calibration step is required.
You can find a calibration example in the TensorRT samples below:
Below is another good tutorial from a community user for your reference:
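To make the calibration step above concrete, here is a minimal sketch of an INT8 entropy calibrator and builder setup using the TensorRT 8.0 Python API. It is not a complete application: the ONNX file path, the calibration batch arrays, and the cache file name are placeholders you would replace with your own data, and it assumes your calibration batches are already preprocessed exactly as they would be at inference time. It requires an NVIDIA GPU with `tensorrt` and `pycuda` installed, so it cannot run on a CPU-only machine.

```python
import numpy as np
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)


class NpyEntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed FP32 batches to TensorRT during INT8 calibration."""

    def __init__(self, batches, cache_file="calib.cache"):
        super().__init__()
        self.batches = batches      # list of np.float32 arrays, all the same shape
        self.index = 0
        self.cache_file = cache_file
        # One device buffer, reused for every batch.
        self.device_input = cuda.mem_alloc(batches[0].nbytes)

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None             # returning None tells TensorRT calibration is done
        cuda.memcpy_htod(self.device_input,
                         np.ascontiguousarray(self.batches[self.index]))
        self.index += 1
        return [int(self.device_input)]

    def read_calibration_cache(self):
        # Reusing a cache skips recalibration on subsequent builds.
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)


def build_int8_engine(onnx_path, calibrator):
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30          # 1 GiB (TRT 8.0-era API)
    config.set_flag(trt.BuilderFlag.INT8)        # enable INT8 kernels
    config.int8_calibrator = calibrator          # attach the calibrator
    # In TensorRT 8.0 this returns a serialized engine (IHostMemory).
    return builder.build_serialized_network(network, config)


# Usage sketch (placeholder paths and shapes):
# batches = [np.random.rand(8, 3, 224, 224).astype(np.float32) for _ in range(10)]
# engine_bytes = build_int8_engine("net.onnx", NpyEntropyCalibrator(batches))
```

The key points are the two `config` settings (`BuilderFlag.INT8` plus `int8_calibrator`) and the `get_batch` / cache callbacks, which TensorRT invokes itself while it measures activation ranges during the build.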
Thanks so much! I will try it out
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.