What affects the floating-point accuracy of a TensorRT engine output?

I first posted this issue to the TensorRT forum.
However, AakankshaS advised that it would be better to post it to the Jetson forum.
So I moved this topic to the Jetson forum.

Description

We know that the order of floating-point operations may affect floating-point accuracy, as described here:

CUDA Floating Point (nvidia.com)

So I want to know which modules determine the order of calculation, and therefore the output, of a TensorRT engine (a small demonstration of the order effect follows the list below):

i) the TensorRT engine only
ii) the TensorRT runtime
iii) others
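
As a small demonstration of the order effect (my own example, not taken from the linked whitepaper), single-precision addition is not associative, so the grouping of the same three terms changes the result:

```python
# Floating-point addition is not associative: regrouping the same
# three terms changes the single-precision result.
import numpy as np

a = np.float32(1e8)
b = np.float32(-1e8)
c = np.float32(1.0)

left = (a + b) + c   # the large terms cancel first, so c survives -> 1.0
right = a + (b + c)  # c is absorbed into -1e8 before the cancellation -> 0.0
print(left, right, left == right)  # 1.0 0.0 False
```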

Environment

TensorRT Version: 8.5.2.2
GPU Type: Jetson Orin 32 GB
Nvidia Driver Version: 35.2.1
CUDA Version: 11.4
CUDNN Version: 8.6.0
Operating System + Version: Ubuntu 20.04.6 LTS
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):


Hi,

This looks like a Jetson issue. Please refer to the samples below in case they are useful.

For any further assistance, we will move this post to the Jetson-related forum.

Thanks!

Dear AakankshaS,

Thank you for your information.

However, I could not find the answer in the samples.

Could you move this post to the Jetson-related forum?

Regards,
hiro

Hi,

When inferring with TensorRT, you first need to build the model (e.g. ONNX) into a TensorRT engine.
At that point, you can choose which data precision to convert to: for example, INT8, FP32, or FP16.
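
For illustration only, here is a minimal sketch of such a build using the TensorRT Python API; the file names are placeholders and error handling is reduced to the essentials:

```python
# Build an ONNX model into a TensorRT engine with FP16 enabled.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # INT8 and the default FP32 are other options

engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:  # placeholder path
    f.write(engine_bytes)
```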

If inference is run with the same engine, the outputs are expected to be similar.
Thanks.

Dear AastaLLL,

If inference is run with the same engine, the outputs are expected to be similar.

We plan to use the FP16 format.
And we know that the order of floating-point operations may affect floating-point accuracy.

So I want to know which modules determine the order of calculation, and therefore the output, of a TensorRT engine:

i) the TensorRT engine only
ii) the TensorRT runtime
iii) others

Regards,
hiro

Hi,

The accuracy loss comes from data precision loss, so it should be i).

If PTQ (post-training quantization) is used, the accuracy loss cannot be reduced.
But with QAT (quantization-aware training), the DNN can learn to compensate for the possible accuracy loss.
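
As a hedged sketch of the practical difference between the two paths, assuming the TensorRT Python API (the calibrator here is a non-functional stub; a real one must feed representative input batches):

```python
# PTQ vs. QAT from the builder-config point of view.
import tensorrt as trt

class StubCalibrator(trt.IInt8EntropyCalibrator2):
    """Placeholder PTQ calibrator; a real one supplies calibration batches."""
    def __init__(self):
        trt.IInt8EntropyCalibrator2.__init__(self)
    def get_batch_size(self):
        return 1
    def get_batch(self, names):
        return None  # None signals "no more calibration data"
    def read_calibration_cache(self):
        return None
    def write_calibration_cache(self, cache):
        pass

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# PTQ: quantization scales are computed after training from calibration data.
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = StubCalibrator()

# QAT: the ONNX model already carries Q/DQ nodes learned during training,
# so the INT8 flag alone is enough and no calibrator is needed.
```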

Thanks.

Hi, AastaLLL,

The accuracy loss comes from data precision loss, so it should be i).

My question is not about data precision loss but about operation order and accuracy.
Please see Section 2.2, Operations and Accuracy, on the following NVIDIA page:

CUDA Floating Point (nvidia.com)

In floating point, the value of ((A+B)+C) does not necessarily equal the value of (A+(B+C)).

So I want to confirm which modules determine the order of calculation.

Regards,
hiro

Hi,

The implementation is included in the TensorRT runtime.
The TensorRT engine only contains the serialized, quantized data.
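
A minimal sketch of that split, assuming the TensorRT Python API ("model.engine" is a placeholder path):

```python
# The engine file is just serialized data; the runtime executes it.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)  # the runtime holds the execution implementation

with open("model.engine", "rb") as f:  # placeholder path
    engine = runtime.deserialize_cuda_engine(f.read())  # engine = serialized network

context = engine.create_execution_context()  # inference goes through the runtime
```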

Could you share more about your use case?
Do you want to minimize the accuracy loss when inferring with fp16 mode?

Thanks.

Hi, AastaLLL,

We are considering how to validate the output of our program.
If the output values depended only on the TensorRT engine, we could focus verification on the engine itself.

However, we now understand that the TensorRT runtime may also affect the output accuracy.
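
One possible tolerance-based check for that validation (my own suggestion, not something proposed in this thread): compare the FP16 engine outputs against an FP32 reference within tolerances instead of expecting bit-exact results.

```python
# Compare two output tensors within FP16-scale tolerances.
import numpy as np

def outputs_match(reference: np.ndarray, candidate: np.ndarray,
                  rtol: float = 1e-2, atol: float = 1e-3) -> bool:
    """The tolerance values are illustrative; suitable ones depend on the model."""
    return np.allclose(reference.astype(np.float32),
                       candidate.astype(np.float32),
                       rtol=rtol, atol=atol)

# Tiny synthetic example: outputs that differ by less than the tolerance.
ref = np.array([0.1234, 5.678, -3.21], dtype=np.float32)
out = ref + np.float32(1e-4)
print(outputs_match(ref, out))  # True
```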

Regards,
hiro
