TensorRT INT8 inference accuracy

Description

When I quantize my segmentation model to FP16, inference accuracy reaches 0.92, but when I quantize it to INT8, inference accuracy drops to 0.59.

Environment

TensorRT Version: 7.2.0.14
GPU Type: NVIDIA Jetson Xavier NX
Nvidia Driver Version:
CUDA Version: 10.2
CUDNN Version: 8.0.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.6.0
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi, please refer to the links below on performing inference in INT8.
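An accuracy drop of this size usually points at calibration: INT8 maps activations onto only 256 levels, so the choice of dynamic range (the `amax` the calibrator picks) matters far more than it does for FP16. As a rough, framework-agnostic illustration (plain NumPy, not TensorRT's calibrator API), symmetric per-tensor INT8 quantization and its reconstruction error look like this:

```python
import numpy as np

def quantize_int8(x, amax):
    # Symmetric per-tensor quantization: map [-amax, amax] onto [-127, 127].
    scale = amax / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=10000).astype(np.float32)

# A calibrated range (e.g. a high percentile of |x|) trades a tiny amount of
# clipping for much finer quantization steps than a naive worst-case range.
amax = np.percentile(np.abs(x), 99.9)
q, scale = quantize_int8(x, amax)
mse = float(np.mean((dequantize(q, scale) - x) ** 2))
```

If the calibration data does not resemble the real inference inputs, the chosen `amax` will be wrong and this error grows sharply, which is a common cause of the kind of accuracy collapse described above.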

Thanks!

Hi,

Also, it looks like you're using an old version of TensorRT. We recommend you try the latest version, 8.4.

Thank you.