INT8 inference with different results

Hi. I'm using TensorRT INT8 to accelerate a semantic segmentation network, but with the same calibration table file I get slightly different results across runs. Can anyone tell me why?

Thank you

BTW, I'm using TensorRT 4.0.1.6.

Hello, can you provide details on the platform you are using?

Linux distro and version
GPU type
nvidia driver version
CUDA version
CUDNN version
Python version [if using python]
Tensorflow version
TensorRT version
Jetson platform (if using)

Linux distro and version: Ubuntu16.04
GPU type: gtx1080ti
nvidia driver version: 390.77
CUDA version: 9.0.176
CUDNN version: 7.0.5
Python version [if using python]: N/A (using the TensorRT C++ API)
Tensorflow version: N/A (using the TensorRT C++ API)
TensorRT version: 4.0.1.6
Jetson platform (if using): N/A

I found another problem: the FP32 and INT8 results for the same image are different. I printed the outputs of the layers and found that the outputs start to differ at a conv layer, while all previous layers produce the same outputs. If I remove the ReLU layer before that conv layer, the outputs become similar (but still not identical).

Is there anything wrong with the ReLU layer in INT8 inference? Or is the INT8 output overflowing?
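To illustrate what I mean by overflow, here is a minimal NumPy sketch of symmetric INT8 quantization. The `amax` value and activations are made-up assumptions for illustration, not values from my network or from the calibration table:

```python
import numpy as np

def quantize_int8(x, amax):
    # Symmetric INT8 quantization: scale by 127/amax, round,
    # then saturate to the representable range [-127, 127].
    scale = 127.0 / amax
    return np.clip(np.round(x * scale), -127, 127).astype(np.int8)

def dequantize(q, amax):
    # Map the INT8 codes back to floating point.
    return q.astype(np.float32) * (amax / 127.0)

# Activations after a ReLU are non-negative; any value above the
# calibrated dynamic range `amax` saturates, so the dequantized
# result diverges from the FP32 result there.
acts = np.array([0.0, 0.5, 1.0, 2.0, 6.0], dtype=np.float32)
amax = 2.0  # hypothetical calibrated range
deq = dequantize(quantize_int8(acts, amax), amax)
print(deq)  # the 6.0 entry saturates to 2.0; in-range values only round slightly
```

In-range values pick up only small rounding error, while out-of-range values clamp hard, which could explain a large difference appearing right after a ReLU.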
Thank you.


A request from the engineering team: it'd be great if you could provide a simplified test case or example that exhibits the problem you are seeing. That would make it easier for us to look into the issue further.

thank you


With the same calibration table, we would expect deterministic output.

“the fp32 and int8 result of the same image is different”
What do you mean by result here? Is the accuracy different, or the output labels?

It would be great if you could provide repro steps.
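In the meantime, the per-layer comparison you describe could be automated roughly as below. This is a sketch only: the layer names and activation values are hypothetical, and actually dumping the tensors (e.g. by marking each layer as a network output in both the FP32 and INT8 engines) is assumed to have happened elsewhere:

```python
import numpy as np

def first_divergent_layer(layers, rtol=0.05, atol=0.05):
    # layers: list of (name, fp32_output, int8_output) triples in
    # network order. Returns (name, max_abs_diff) for the first layer
    # whose outputs disagree beyond tolerance, or (None, 0.0) if all match.
    for name, fp32_out, int8_out in layers:
        if not np.allclose(fp32_out, int8_out, rtol=rtol, atol=atol):
            return name, float(np.max(np.abs(fp32_out - int8_out)))
    return None, 0.0

# Hypothetical dumped activations: conv2 is the first layer to diverge.
layers = [
    ("conv1", np.array([0.1, 0.5]), np.array([0.102, 0.498])),
    ("relu1", np.array([0.1, 0.5]), np.array([0.102, 0.498])),
    ("conv2", np.array([1.0, 3.0]), np.array([1.0, 3.6])),
]
name, diff = first_divergent_layer(layers)
print(name)  # reports conv2 as the first divergent layer
```

Running something like this on both engines' dumps would pin down exactly where the FP32 and INT8 paths part ways, which is the kind of detail that helps a repro.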