Can't get INT8 precision to work

Hi, I’m working on an NVIDIA Jetson Xavier NX, converting ONNX models with TensorRT in Python, and I’m having an issue with the conversion to INT8 precision.

Using only the --int8 flag of the trtexec command (without the --calib flag), I get a converted model, but its predictions are wrong. For example, on a gender recognition problem from images, the model predicts one of the two genders with 0.49 probability and the other with 0.51. The original model and the FP16 model both performed very well.

I also tried following one of the samples on GitHub (TensorRT/samples/python/int8_caffe_mnist/), which let me generate a calibration cache for my models, but the caches do not seem to be correct either. The files contain strange hexadecimal values (very close to infinity when converted to float32), and when I pass such a file to trtexec with the --calib flag, the resulting model predicts NaN instead of probability values.
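For anyone who wants to inspect such a cache, here is a minimal sketch of how the hexadecimal values could be decoded and checked. It assumes the cache is a text file with one header line followed by "tensor_name: hexvalue" lines, where each value is 8 hex digits encoding a big-endian float32 scale. That layout is an assumption based on how these files typically look, not a documented format. The function names, the threshold, and the sample cache contents are all made up for illustration.

```python
import math
import struct

# Made-up sample contents, for illustration only (not taken from a real model).
SAMPLE_CACHE = """TRT-8001-EntropyCalibration2
input_1:0: 3c010a14
dense/BiasAdd:0: 7f7fffff
"""


def parse_calibration_cache(text):
    """Return (header, {tensor_name: scale}) from the text of a calibration cache."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header, entries = lines[0], lines[1:]
    scales = {}
    for entry in entries:
        # Split on the last ": " because tensor names may contain colons themselves.
        name, sep, hex_value = entry.rpartition(": ")
        if not sep or len(hex_value) != 8:
            continue  # skip lines that do not match the assumed layout
        # Assumption: the 8 hex digits are a big-endian IEEE 754 float32.
        scales[name] = struct.unpack(">f", bytes.fromhex(hex_value))[0]
    return header, scales


def suspicious_scales(scales, upper=1e6):
    """Return entries whose scale is non-finite, non-positive, or implausibly large.

    The `upper` threshold is an arbitrary illustrative choice.
    """
    return {
        name: value
        for name, value in scales.items()
        if not math.isfinite(value) or value <= 0.0 or value > upper
    }


if __name__ == "__main__":
    header, scales = parse_calibration_cache(SAMPLE_CACHE)
    print(header)
    for name, value in suspicious_scales(scales).items():
        print(f"suspicious scale for {name}: {value!r}")
```

If most tensors show up as suspicious, the input fed to the calibrator is probably the first thing to check (dtype, shape, layout, and the same preprocessing as at inference time), although that is only a guess at the cause.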

The TensorRT version is 8.0.1 and the models were converted from TensorFlow to ONNX using opset 13.

I hope I have been clear in my explanation of the problem and that I have not created the topic in the wrong place.

Thanks.

Hi, please refer to the links below to perform inference in INT8.

Thanks!

Hi, I’ve already tried following the links you gave me (in fact, I looked everywhere on the internet for help), but the situation is still the same. Could there be a problem with the TensorRT version?

Thanks.

Hi,

We are moving this post to the Jetson Xavier NX forum to get better help.
Also, we recommend trying the latest TensorRT version. Please let us know if you still face this issue.

You can try the TensorRT NGC container.

Thank you.