TensorRT gives different results on Jetson Orin

carlosn08rr · March 30, 2023, 11:13am

Hi,
We’ve been using TensorRT for several years with different neural networks on different platforms: Jetson (Xavier), desktop (2080), server (T4), …
We’ve just started supporting Jetson Orin with our current models and have found an odd issue:
some networks return different values on Jetson Orin AGX with JetPack 5.1.

I have created an example using pose estimation to reproduce the problem. You can download the code from the following link:

https://ha-embedded-team-public.s3.eu-west-2.amazonaws.com/nvidia-forum/pose-estimation-trt.tar.xz

Steps to reproduce the problem:
1 - Install onnxruntime
$> pip3 install onnxruntime
2 - Build plan file using trtexec
$> ./trtexec_pose.sh build
3 - Run plan file and save outputs
$> ./trtexec_pose.sh infer
4 - Get results for the same input using onnxruntime
$> python3 onnxruntime_pose.py
5 - Finally compare trtexec and onnxruntime results
$> python3 check_results.py
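
The contents of check_results.py are not shown in this thread. As a rough illustration of what such a comparison might do, here is a minimal sketch, assuming both outputs have already been loaded as flat lists of floats. The function name, the tolerance, and the return layout are all hypothetical, and it uses only the standard library:

```python
from typing import List, Sequence, Tuple


def compare_outputs(
    first: Sequence[float],
    second: Sequence[float],
    atol: float = 1e-3,  # hypothetical tolerance, not taken from the thread
) -> Tuple[float, float, List[Tuple[int, float, float, float]]]:
    """Return (max abs error, mean abs error, mismatches above atol)."""
    if len(first) != len(second):
        raise ValueError("outputs must have the same number of elements")

    # Element-wise absolute difference between the two result sets.
    diffs = [abs(a - b) for a, b in zip(first, second)]

    # Record index, both values, and the difference for every element
    # whose difference exceeds the tolerance.
    mismatches = [
        (i, a, b, d)
        for i, (a, b, d) in enumerate(zip(first, second, diffs))
        if d > atol
    ]

    max_err = max(diffs, default=0.0)
    mean_err = sum(diffs) / len(diffs) if diffs else 0.0
    return max_err, mean_err, mismatches


if __name__ == "__main__":
    max_err, mean_err, bad = compare_outputs([1.0, 2.0], [1.0, 2.5], atol=0.1)
    print("err max", max_err, "mean", mean_err)
    for index, a, b, d in bad:
        print("--- output mismatch:", index, ">>>", a, "vs", b, "|", d)
```

Reporting both the maximum and the mean error makes it easier to tell a handful of outliers apart from a systematic drift across the whole output tensor.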

The same steps on Jetson Xavier AGX (JetPack 4.5), Tesla T4, RTX 2080 Ti, … give equivalent results whether we use onnxruntime or TensorRT.

AastaLLL · March 31, 2023, 3:18am

Hi,

Thanks. We are going to give it a try.
Have you also tested this on JetPack 5.1.1, which was released recently?

Thanks.

AastaLLL · March 31, 2023, 3:55am

Hi,

We have confirmed this issue can be reproduced on Orin with JetPack 5.1.1.
Our internal team is checking on this. We will share more information later.

--- output mismatch: (2, 28960) >>> 2.3787856101989746 vs 2.37354 | 0.005245610198974404
--- output mismatch: (2, 28962) >>> 124.85875701904297 vs 124.846 | 0.012757019042965112
--- output mismatch: (2, 28963) >>> 91.82755279541016 vs 91.7947 | 0.03285279541015029
--- output mismatch: (2, 28964) >>> 0.26270583271980286 vs 0.261643 | 0.001062832719802842
--- output mismatch: (2, 28966) >>> 0.11150957643985748 vs 0.109445 | 0.0020645764398574823
...

Thanks.

AastaLLL · May 2, 2023, 2:33am

Hi,

Thanks for your patience.

We found that this issue is related to TF32 kernels.
As a workaround, please run trtexec with the --noTF32 flag.

The error after applying the workaround is:

err max 0.000645141601566479 mean 3.0629781962014103e-06
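
For context on why TF32 kernels can shift results: TF32 keeps the 8-bit exponent of float32 but only 10 of its 23 mantissa bits. The following minimal sketch emulates that reduced precision in plain Python. It simply rounds a float32 value to 10 mantissa bits; the function name is hypothetical, the exact rounding done by the hardware may differ, and the sketch ignores inf/NaN handling:

```python
import struct


def round_to_tf32(x: float) -> float:
    """Round x to float32, then keep only the top 10 mantissa bits."""
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # float32 stores 23 mantissa bits; dropping the low 13 leaves 10.
    # Adding half of the dropped range first rounds to nearest
    # (an assumption made for this sketch).
    bits = (bits + (1 << 12)) & ~((1 << 13) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (2.3787856101989746, 124.85875701904297, 91.82755279541016):
        rounded = round_to_tf32(value)
        rel_err = abs(rounded - value) / abs(value)
        print(value, "->", rounded, "relative error", rel_err)
```

A single rounding of this kind has a relative error of at most 2^-11 (about 4.9e-4), and such errors can accumulate across layers of a network.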

Thanks.

carlosn08rr · May 2, 2023, 8:09am

Thank you for the workaround.

system · May 30, 2023, 8:24am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

AastaLLL · June 5, 2023, 5:52am

Hi,

Did --noTF32 fix your issue?
Thanks.
