I have a bigger ONNX model that gives inconsistent inference results between ONNX Runtime and TensorRT.
TensorRT Version: 7.1.3
GPU Type: TX2
CUDA Version: 10.2.89
CUDNN Version: 18.104.22.168
Operating System + Version: Jetpack 4.4 (L4T 32.4.3)
reduced.onnx (62.0 KB)
Steps To Reproduce
polygraphy debug reduce bigger.onnx -o reduced.onnx --check polygraphy run polygraphy_debug.onnx --onnxrt --trt --trt-outputs mark all --onnx-outputs mark all --fail-fast
I was able to reduce it down to this onnx file.
And running this
polygraphy run reduced.onnx --trt --onnxrt --trt-outputs mark all --onnx-outputs mark all --fail-fast
fails with the message:
FAILED | Output: 'model_2/model/block_2_add/add:0' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
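For reference, the comparison behind that message corresponds to an elementwise check of the form abs(trt - ref) <= atol + rtol * abs(ref). A minimal pure-Python sketch (the function name and values are illustrative, not polygraphy's actual code):

```python
def within_tolerance(trt_out, ref_out, atol=1e-5, rtol=1e-5):
    # Elementwise rel/abs tolerance check, mirroring the kind of
    # comparison polygraphy reports (illustrative, not its exact code).
    return all(abs(t - r) <= atol + rtol * abs(r)
               for t, r in zip(trt_out, ref_out))

# A 5e-5 difference on a value near 1.0 fails the default 1e-5 tolerances:
print(within_tolerance([1.00005], [1.0]))  # False
```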
Can you please help resolve the accuracy difference (I think that’s the problem) so that I get matching inference results between ONNX Runtime and TensorRT?
Could you share the ONNX model and the script, if not shared already, so that we can assist you better?
Meanwhile, you can try a few things:
1) Validate your model with the below snippet:
import onnx
filename = yourONNXmodel
model = onnx.load(filename)
onnx.checker.check_model(model)
2) Try running your model with the trtexec command.
In case you are still facing the issue, please share the trtexec --verbose log for further debugging.
No error on check_model.py
Here is the log for running reduced.onnx.
No error running trtexec on reduced.onnx, nor on the bigger.onnx file; I have run it many times already. (I won’t be able to upload the bigger.onnx file.)
trtexec_verbose.log (99.3 KB)
Is it possible for you guys to reproduce the polygraphy run (exceeding default tolerance level)?
Sorry for the delayed response.
We recommend using the latest TensorRT version, 8.5.
We think the polygraphy issue is just a tolerance issue. With --atol 1e-4 --rtol 1e-4, the polygraphy check passes; the default values are --atol 1e-5 --rtol 1e-5.
1e-4 is a fairly reasonable tolerance.
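For context on why 1e-4 is reasonable: a single float32 operation already carries about 1.2e-7 relative rounding error, and those errors accumulate differently across the differently ordered kernels used by TensorRT and ONNX-Runtime, so fp32 networks commonly drift past 1e-5 after many layers. A pure-Python sketch of the single-op limit (the `f32` helper is illustrative):

```python
import struct

def f32(x):
    # Round a Python float to float32 precision, simulating fp32 math.
    return struct.unpack('f', struct.pack('f', x))[0]

# Find float32 machine epsilon: the smallest step distinguishable from 1.0.
eps = 1.0
while f32(1.0 + eps / 2) != 1.0:
    eps /= 2
print(eps)  # 1.1920928955078125e-07, i.e. 2**-23
```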
Are you looking for 100% numerical accuracy in comparison with ONNX-Runtime results?
Or do you have a benchmarking metric for inference accuracy?