Question about TensorRT reproducibility on different architectures

Description

I have a general question about what to expect from TensorRT engine files.

I used the Transfer Learning Toolkit/TAO Toolkit on a dGPU to train a custom YOLOv3 model. I iterated through several pruning and retraining steps until I had a model that performed really well on my training and test sets. I then exported a .etlt file, built a TensorRT .engine file, ran inference on a test set on the dGPU, and was happy with the results.

Next, I moved the model to a Jetson Nano by creating a .engine file on the Nano, using the same tlt-converter command line as on the dGPU. However, the inference results, as judged by eye from the bounding boxes in DeepStream 5.1, were much worse than on the dGPU. I don't know of an easy way to run inference on the Nano directly from a .engine file so that I can measure the results quantitatively.

My question is this: should TensorRT engines produce the same results on different platforms? If so, I will continue trying to figure out how to quantitatively measure inference for the TLT YOLOv3 model outside of DeepStream 5.1. If not, how does one select the proper model on the dGPU to deploy to edge devices?
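For reference, the kind of standalone check I have in mind on the Nano looks roughly like the sketch below, using the TensorRT and PyCUDA Python bindings. This is only a sketch under assumptions: it assumes an implicit-batch engine (the style tlt-converter produces for TensorRT 7.x), and the engine path, input resolution, and preprocessing are placeholders that would have to match what TAO used during export.

```python
# Minimal standalone inference with a serialized TensorRT .engine file
# (TensorRT 7.x implicit-batch API). Paths, input shape, and preprocessing
# are placeholders, not the exact values from my pipeline.
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)


def load_engine(engine_path):
    """Deserialize a .engine file that was built on this same device."""
    with open(engine_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())


def infer(engine, image):
    """Run one preprocessed image through the engine; return raw output tensors."""
    stream = cuda.Stream()
    host_bufs, dev_bufs, bindings = [], [], []
    input_idx, output_idx = [], []
    for i, binding in enumerate(engine):
        shape = engine.get_binding_shape(binding)
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        host_mem = cuda.pagelocked_empty(trt.volume(shape), dtype)
        dev_mem = cuda.mem_alloc(host_mem.nbytes)
        host_bufs.append(host_mem)
        dev_bufs.append(dev_mem)
        bindings.append(int(dev_mem))
        if engine.binding_is_input(binding):
            # The image must already be preprocessed to the engine's shape/dtype.
            np.copyto(host_mem, image.ravel().astype(dtype))
            input_idx.append(i)
        else:
            output_idx.append(i)

    with engine.create_execution_context() as context:
        for i in input_idx:
            cuda.memcpy_htod_async(dev_bufs[i], host_bufs[i], stream)
        context.execute_async(batch_size=1, bindings=bindings,
                              stream_handle=stream.handle)
        for i in output_idx:
            cuda.memcpy_dtoh_async(host_bufs[i], dev_bufs[i], stream)
        stream.synchronize()
    return [host_bufs[i] for i in output_idx]


if __name__ == "__main__":
    engine = load_engine("yolov3_nano.engine")              # placeholder path
    frame = np.random.rand(3, 384, 1248).astype(np.float32)  # stand-in for a real frame
    for out in infer(engine, frame):
        print(out.shape, out.dtype)
```

With something like this I could run the same preprocessed test images through the Nano engine and score the detections outside of DeepStream.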

Environment

TensorRT Version: 7.2.1
GPU Type: Tesla V100
Nvidia Driver Version: 460.32.03
CUDA Version:
CUDNN Version:
Operating System + Version: CentOS7
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi @nk_white,

TensorRT engines are not expected to produce identical results on different platforms. An engine is optimized for the specific GPU it is built on, so it must be rebuilt on each target device, and small numerical differences across platforms are normal. Please refer to the following post for more details on your query.

Thank you.


Thank you for your reply. From that explanation, I take "floating-point differences" to mean the results should still be qualitatively equivalent across platforms. The prediction performance I'm seeing is not qualitatively equivalent, so I will keep trying to figure out why my model is performing so poorly in DeepStream 5.1.
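As a rough way to separate ordinary floating-point noise from a real regression, my plan is to feed the same preprocessed frame through both engines, dump the raw output tensors to .npy files on each device, and compare them numerically. A minimal sketch, assuming one output tensor has been saved per platform (file names are placeholders):

```python
# Compare raw engine outputs saved on the dGPU and on the Nano for the same
# preprocessed input frame. File names are placeholders.
import numpy as np

dgpu_out = np.load("output_dgpu.npy")  # saved with np.save() on the V100
nano_out = np.load("output_nano.npy")  # saved with np.save() on the Nano

diff = np.abs(dgpu_out.astype(np.float32) - nano_out.astype(np.float32))
print("max abs diff :", diff.max())
print("mean abs diff:", diff.mean())

# The acceptable tolerance depends on the build (e.g. FP16 on the Nano vs
# FP32 on the V100); small differences would be consistent with floating-point
# behavior, while large gaps would point at preprocessing, the engine build,
# or the DeepStream configuration instead.
print("qualitatively equal:", np.allclose(dgpu_out, nano_out, atol=1e-3))
```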
