TensorRT gives different results than ONNX and PyTorch

Description

When I create a TensorRT engine from an ONNX file and compare the inference outputs of the two formats, I get different results. The difference is significant and is not due to precision or optimizations.

Environment

TensorRT Version: 8.6.1.0
GPU Type: NVIDIA RTX A3000
Nvidia Driver Version: 535.54.03
CUDA Version: 12.2
CUDNN Version: 8.9.2
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable): N/A
PyTorch Version (if applicable): 2.0.1+cu118
Baremetal or Container (if container which image + tag): N/A

Relevant Files

  1. model_640x640_FULL_STAT.onnx - ONNX file
  2. model_640x640_fp16_stat.engine - TensorRT engine file
  3. comp_onnx_trt.py - Python script that loads both formats and compares their results.
  4. zidane.jpg - file used as input to the networks.

Steps To Reproduce

  1. Build the TensorRT engine from the ONNX file:

     trtexec --onnx=model_640x640_FULL_STAT.onnx --saveEngine=model_640x640_fp16_stat.engine --useSpinWait --fp16

     (I also tried without --fp16.)
  2. Run the attached Python script (comp_onnx_trt.py). It loads the ONNX and TensorRT models, feeds the same attached image to both, compares the results, and prints the maximum difference between them.
  3. The printed result is: max diff: 6.824637
  4. When I compare the ONNX results to PyTorch using the same input, the difference is low (~0.01).
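The comparison in steps 2 to 4 can be sketched roughly as follows. This is not the attached comp_onnx_trt.py, which is not reproduced in this thread. The helper name `compare_outputs`, the tolerances, and the synthetic arrays are assumptions made for illustration. In a real run the two arrays would be the ONNX and TensorRT outputs for the same input image.

```python
import numpy as np

def compare_outputs(ref, test, atol=1e-2, rtol=1e-2):
    """Summarize the element-wise difference between two output tensors.

    `ref` and `test` are hypothetical stand-ins for the ONNX and TensorRT
    outputs; the tolerances are illustrative defaults, not recommendations.
    """
    ref = np.asarray(ref, dtype=np.float32)
    test = np.asarray(test, dtype=np.float32)
    if ref.shape != test.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {test.shape}")

    abs_diff = np.abs(ref - test)
    # Location of the worst element, useful for finding which output
    # row or channel diverges.
    worst = np.unravel_index(np.argmax(abs_diff), abs_diff.shape)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "argmax": tuple(int(i) for i in worst),
        "allclose": bool(np.allclose(test, ref, atol=atol, rtol=rtol)),
    }

# Synthetic data only: one element is perturbed to mimic a large mismatch.
onnx_out = np.zeros((1, 4, 3), dtype=np.float32)
trt_out = onnx_out.copy()
trt_out[0, 2, 1] = 6.5

report = compare_outputs(onnx_out, trt_out)
print(report)
```

Reporting the index of the largest difference alongside the maximum itself can help show whether the mismatch is concentrated in a few output elements or spread across the whole tensor.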

comp_onnx_trt.py (3.1 KB)
model_640x640_fp16_stat.engine (5.2 MB)
model_640x640_FULL_STAT.onnx (7.9 MB)

Hi @erez.h,

We could reproduce the issue. This is a known issue and will be fixed in a future major release. As a temporary workaround, we recommend using FP32 precision.

Thank you.

Can we get the engine output with the .py file in this attachment? Also, when I run the command, I get the error: tensorrt trtexec (tensorrt v8502)usr/src/… Is there a solution?
Note: what model did you use exactly?

Hi,
I also tried using FP32 precision (removing the "--fp16" flag) and the results are the same.
Do I need to use a different configuration for trtexec (other than "trtexec --onnx=model_640x640_FULL_STAT.onnx --saveEngine=model_640x640_stat.engine --useSpinWait")?
Thanks,
Erez

The above command should work fine.
If we don’t specify precision, TensorRT will use the “FP32” precision by default.
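As a rough sanity check on how large a precision-only discrepancy could be, the sketch below measures the rounding error from casting FP32 values to FP16 and back. It is an assumption-laden illustration: the value range and the helper name `fp16_roundtrip_error` are made up, and it covers only a single rounding step, not the error accumulated across the layers of a real network.

```python
import numpy as np

def fp16_roundtrip_error(x):
    """Return (max absolute, max relative) error of an FP32 -> FP16 -> FP32 cast."""
    x = np.asarray(x, dtype=np.float32)
    x16 = x.astype(np.float16).astype(np.float32)
    abs_err = np.abs(x - x16)
    rel_err = abs_err / np.abs(x)  # inputs below are nonzero by construction
    return float(abs_err.max()), float(rel_err.max())

# Hypothetical value range chosen for illustration (1 to 640, e.g. pixel
# coordinates on a 640x640 input); not taken from the model in this thread.
rng = np.random.default_rng(0)
values = rng.uniform(1.0, 640.0, size=100_000).astype(np.float32)

max_abs, max_rel = fp16_roundtrip_error(values)
print(f"max abs error: {max_abs:.4f}, max rel error: {max_rel:.6f}")
```

FP16 has a 10-bit mantissa, so for values in its normal range one rounding step has a relative error of at most 2^-11 (about 0.0005). A discrepancy that persists when the engine is built without --fp16 cannot be explained by FP16 rounding.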

Thanks.
As I wrote, I already tried this and got the same results.

Hi
Can you please check if you have a valid workaround for this issue?
Also, when do you expect a version with a fix to be released?

Hi!

I have the same problem.

There is a large difference between the PyTorch and TensorRT outputs. It would be great to have some feedback about upcoming releases.

Hello. I have the same problem.

TensorRT Version : 8.6.1.0
GPU Type : NVIDIA 2060
Nvidia Driver Version : 470.182.03
CUDA Version : 11.4
CUDNN Version : 8.4.0
Operating System + Version : Ubuntu 20.04
Python Version (if applicable) : 3.10.12
TensorFlow Version (if applicable) : N/A
PyTorch Version (if applicable) : 1.12.1+cu113
Baremetal or Container (if container which image + tag) : N/A