INT8 inference with different results

Hi. I'm using TensorRT INT8 to accelerate a semantic segmentation network, but with the same calibration table file I get results that differ slightly from run to run. Can anyone tell me why?

Thank you

BTW, I'm using TensorRT 4.0.1.6.
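For reference, this is roughly how I build the INT8 engine from the cached calibration table (a minimal sketch; the class name, cache path, and build-time snippet are simplified placeholders, not my exact code):

```cpp
#include <fstream>
#include <vector>
#include "NvInfer.h"

using namespace nvinfer1;

// Calibrator that only replays a previously written calibration table,
// so no calibration batches are needed at build time.
class CacheCalibrator : public IInt8EntropyCalibrator
{
public:
    explicit CacheCalibrator(const char* cachePath)
    {
        std::ifstream in(cachePath, std::ios::binary);
        mCache.assign(std::istreambuf_iterator<char>(in),
                      std::istreambuf_iterator<char>());
    }
    int getBatchSize() const override { return 1; }
    // Never called when a valid cache is returned from readCalibrationCache().
    bool getBatch(void* /*bindings*/[], const char* /*names*/[], int /*nbBindings*/) override
    {
        return false;
    }
    const void* readCalibrationCache(size_t& length) override
    {
        length = mCache.size();
        return mCache.empty() ? nullptr : mCache.data();
    }
    void writeCalibrationCache(const void* /*ptr*/, size_t /*length*/) override {}

private:
    std::vector<char> mCache;
};

// At build time:
//   CacheCalibrator calibrator("calibration.table");  // placeholder path
//   builder->setInt8Mode(true);
//   builder->setInt8Calibrator(&calibrator);
//   ICudaEngine* engine = builder->buildCudaEngine(*network);
```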

Hello, can you provide details on the platforms you are using?

Linux distro and version
GPU type
nvidia driver version
CUDA version
CUDNN version
Python version [if using python]
Tensorflow version
TensorRT version (4.0.1.6)
Jetson platform (if using)

Linux distro and version: Ubuntu 16.04
GPU type: GTX 1080 Ti
nvidia driver version: 390.77
CUDA version: 9.0.176
CUDNN version: 7.0.5
Python version [if using python]: N/A (using the TensorRT C++ API)
Tensorflow version: N/A (using the TensorRT C++ API)
TensorRT version: 4.0.1.6
Jetson platform (if using): N/A

I found another problem: the FP32 and INT8 results for the same image are different. I printed the per-layer outputs and found that they start to differ at a conv layer, while all previous layers produce the same outputs. If I remove the ReLU layer before that conv layer, the outputs become similar (but still not identical). A sketch of how I dump the per-layer outputs is below.
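```cpp
#include "NvInfer.h"

// Mark every layer's output as a network output so that both the FP32 and
// INT8 engines expose the intermediate tensors for side-by-side dumps.
// (Sketch only; my real code marks just the layers under suspicion and
// skips tensors that are already network outputs.)
void markAllLayerOutputs(nvinfer1::INetworkDefinition* network)
{
    for (int i = 0; i < network->getNbLayers(); ++i)
    {
        nvinfer1::ILayer* layer = network->getLayer(i);
        for (int j = 0; j < layer->getNbOutputs(); ++j)
            network->markOutput(*layer->getOutput(j));
    }
}
// After execute()/enqueue(), I copy each extra binding back to the host and
// compare the FP32 and INT8 buffers element-wise (max absolute difference).
```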

Is there anything wrong with the ReLU layer in INT8 inference? Or is it because the INT8 output overflows (saturates)?
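For context, my mental model of the INT8 path (an assumption on my part, not something I have confirmed in the TensorRT docs) is symmetric per-tensor quantization, where values beyond the calibrated dynamic range saturate to ±127 rather than wrap:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdio>

// Symmetric per-tensor INT8 quantization as I understand it:
// scale = dynamicRange / 127; values outside the range saturate to +/-127.
int8_t quantize(float x, float dynamicRange)
{
    float scaled = x * 127.0f / dynamicRange;
    return static_cast<int8_t>(std::max(-127.0f, std::min(127.0f, std::round(scaled))));
}

float dequantize(int8_t q, float dynamicRange)
{
    return q * dynamicRange / 127.0f;
}

int main()
{
    // If calibration picked a range of 6.0 for this tensor, an activation
    // of 9.5 saturates: it round-trips to 6.0, not 9.5.
    float range = 6.0f;
    std::printf("%f -> %f\n", 9.5f, dequantize(quantize(9.5f, range), range));
    return 0;
}
```

If that model is right and the calibrated range at the conv input is too tight, large post-ReLU activations would all saturate to the same value, which could explain the divergence at that layer.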
Thank you.

Hello,

a request from the engineering team: it would be great if you could provide a simplified test case or example that exhibits the problem you are seeing. That would make it easier for us to look into the issue further.

Thank you

Hello,

With the same calibration table, we would expect to see deterministic output.

“the FP32 and INT8 results for the same image are different”
What do you mean by “result” here? Is it the accuracy that differs, or the output labels?

It would be great if you could provide repro steps.