Hi, I’m working on an NVIDIA Jetson Xavier NX, converting ONNX models with TensorRT in Python, and I’m having an issue with conversion to INT8 precision.
If I simply pass the --int8 flag to trtexec (without the --calib flag), the conversion succeeds, but the resulting engine’s predictions are wrong: for a gender-recognition problem from images, it always predicts one of the two classes with probability 0.49 and the other with 0.51, even though both the original model and the FP16 engine performed very well.
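For reference, this is the trtexec invocation I’m aiming at, printed from Python since the rest of my pipeline is Python (the file paths are placeholders; the flags themselves, --onnx, --int8, --calib, and --saveEngine, are standard trtexec options):

```python
import shlex

# Paths are placeholders for my actual model/cache files.
cmd = [
    "trtexec",
    "--onnx=model.onnx",
    "--int8",                          # build the engine with INT8 precision
    "--calib=calib.cache",             # calibration cache; omitting it is what
                                       # gave me the ~0.49/0.51 predictions
    "--saveEngine=model_int8.engine",
]
print(shlex.join(cmd))
```

As far as I understand, when --int8 is given without any calibration data, trtexec falls back to placeholder dynamic ranges, which would explain why the uncalibrated engine’s outputs are meaningless.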
I also tried following one of the samples on GitHub (TensorRT/samples/python/int8_caffe_mnist/), which let me generate a calibration cache for my models, but the cache does not seem to be correct either. The produced files contain strange hexadecimal values (very close to infinity when decoded as float32), and when I convert the model with this file via the --calib flag of trtexec, I get an engine that predicts NaN instead of the expected probability values.
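This is how I’m decoding the cache values, assuming each line after the header stores the tensor’s scale as the big-endian hex of its IEEE-754 float32 bits (the tensor names and hex values below are made up for illustration, except 7f800000, which is exactly +inf, the kind of value I’m seeing):

```python
import struct

def decode_scale(hex_str: str) -> float:
    """Decode one calibration-cache hex entry into the float32 scale it encodes."""
    return struct.unpack(">f", bytes.fromhex(hex_str))[0]

# Illustrative cache lines (names/values are hypothetical):
cache_lines = [
    "TRT-8001-EntropyCalibration2",  # header: TRT version + calibrator type
    "input_1: 3c010a14",             # a plausible small scale
    "dense/Softmax: 7f800000",       # 0x7f800000 decodes to +inf -> broken calibration
]

for line in cache_lines[1:]:
    name, hex_val = line.split(": ")
    print(f"{name}: {decode_scale(hex_val)}")
```

With entries that decode to inf, I’d expect the INT8 quantization scales to be garbage, which seems consistent with the NaN outputs I get from the calibrated engine.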
The TensorRT version is 8.0.1 and the models were converted from TensorFlow to ONNX using opset 13.
I hope I have explained the problem clearly and that I have not posted this topic in the wrong place.