Difference between Engine file output in Google Colab (Python) and Local Machine (C++)

Description

I am learning how to perform inference with a TensorRT engine file.

To start with,

  1. I created a PyTorch-based classification model in Google Colab.

  2. Trained the classification model on the MNIST dataset and verified that the trained model achieves 95%+ accuracy.

  3. Exported the model to ONNX (within Google Colab) and verified that the model works (a minimal sketch of this flow appears after this list).

    • At this point I printed the output vector (10 neurons, so shape 1x10):
    ** ONNX OUTPUT IN COLAB ** 
    NEURON   OUTPUT_VALUE
     0 --> -8.656696319580078
     1 --> -8.746328353881836
     2 --> -7.678134441375732
     3 --> -8.662064552307129
     4 --> -0.0022408869117498398
     5 --> -8.611289978027344
     6 --> -8.606916427612305
     7 --> -7.697903633117676
     8 --> -8.510454177856445
     9 --> -8.295673370361328
    
  4. Next, exported the model to a TensorRT engine (again within Google Colab) and verified that it works.

    • Just like before, I printed the output vector; the output is below:
    ** ENGINE FILE OUTPUT IN COLAB **
    NEURON   OUTPUT_VALUE
    0 --> -8.653183937072754
    1 --> -8.742051124572754
    2 --> -7.669453144073486
    3 --> -8.658066749572754
    4 --> -0.0022506495006382465
    5 --> -8.608777046203613
    6 --> -8.602890968322754
    7 --> -7.695576190948486
    8 --> -8.506699562072754
    9 --> -8.293564796447754
    
  5. As we can see, the two outputs closely match. Next, I wanted to explore further with C++.

  6. I downloaded this ONNX file to my local Linux machine (specifications are given in the Environment section below).

  7. Used /usr/src/tensorrt/bin/trtexec --onnx=MNIST_Classifier.onnx --saveEngine=MNIST_Classifier_f16.engine --fp16 to build the engine file.

  8. Now, I used C++ code to execute the engine file. To my surprise, I got an output tensor like this:

    ** ENGINE FILE OUTPUT ON LOCAL MACHINE (C++) **
    NEURON   OUTPUT_VALUE
    0 --> -8.794
    1 --> -8.88364
    2 --> -8.26091
    3 --> -8.79937
    4 --> -0.00169589
    5 --> -8.74809
    6 --> -8.74422
    7 --> -8.21721
    8 --> -8.64776
    9 --> -8.37213
    
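For reference, here is a minimal sketch of the export-and-verify flow from steps 1–3. The architecture below is a stand-in (my actual model is not shown in this post), and names like "input" / "output" are illustrative:

    import numpy as np
    import torch
    import torch.nn as nn
    import onnxruntime as ort

    # Stand-in for the trained classifier from step 1; the real architecture
    # is not shown in this post, so this tiny net is purely illustrative.
    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 128), nn.ReLU(),
        nn.Linear(128, 10),
        nn.LogSoftmax(dim=1),  # consistent with the all-negative outputs above
    )
    model.eval()

    # Step 3: export to ONNX
    dummy = torch.randn(1, 1, 28, 28)  # MNIST input: batch=1, 1 channel, 28x28
    torch.onnx.export(model, dummy, "MNIST_Classifier.onnx",
                      input_names=["input"], output_names=["output"])

    # Verify the ONNX model against the PyTorch output
    sess = ort.InferenceSession("MNIST_Classifier.onnx",
                                providers=["CPUExecutionProvider"])
    onnx_out = sess.run(None, {"input": dummy.numpy()})[0]  # shape (1, 10)
    torch_out = model(dummy).detach().numpy()
    print("max abs diff vs PyTorch:", np.abs(onnx_out - torch_out).max())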

Basically, what I am seeing is ONNX_IN_COLAB == TENSORRT_IN_COLAB != TENSORRT_IN_LOCAL_MACHINE_C++.

Is this difference in the output tensor expected?

Environment

TensorRT Version: 8.4.1
GPU Type: A5000
Nvidia Driver Version:
CUDA Version: 11.6
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.0.1+cu117
Baremetal or Container (if container which image + tag):

Relevant Files

For local-machine C++ inference, I used the code at the link below (modified to suit classification):
YOLOv8-TensorRT/csrc/detect/normal at main · triple-Mu/YOLOv8-TensorRT (github.com)

Hi,

Slight variations between Colab and your local machine’s TensorRT output are expected due to potential non-determinism and optimization techniques.

Please let us know if the difference is relatively high and impacts accuracy.
We also recommend trying the latest TensorRT version, 10.0.1.
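For instance, one quick way to quantify the deviation, sketched with the vectors you posted above (values copied verbatim from the two tables):

    import numpy as np

    colab_engine = np.array([
        -8.653183937072754, -8.742051124572754, -7.669453144073486,
        -8.658066749572754, -0.0022506495006382465, -8.608777046203613,
        -8.602890968322754, -7.695576190948486, -8.506699562072754,
        -8.293564796447754])
    local_engine = np.array([
        -8.794, -8.88364, -8.26091, -8.79937, -0.00169589,
        -8.74809, -8.74422, -8.21721, -8.64776, -8.37213])

    print("max abs diff:", np.abs(colab_engine - local_engine).max())       # ~0.59
    print("same argmax :", colab_engine.argmax() == local_engine.argmax())  # True

Here the predicted class (argmax) is unchanged even though individual logits shift by up to roughly 0.59.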

@spolisetty

Thanks for replying. I understand that the difference is expected due to optimization.
And you mean to say:
PyTorch to ONNX → lossless
ONNX to TensorRT → slightly lossy?
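
For example, a toy illustration of what "slightly lossy" could look like, assuming the local engine really runs at reduced precision because of the --fp16 flag in step 7 (the value below is copied from my ONNX output):

    import numpy as np

    # Toy illustration (my own, not from the model): FP16 rounding alone
    # perturbs values at the magnitude of the logits above.
    x = np.float32(-8.656696319580078)  # neuron 0 output from the ONNX run
    print(np.float16(x))  # ~-8.656; FP16 spacing near 8.7 is 2**-7 (~0.0078)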

One more question: suppose my convolutional neural network contains many loops (see the representative image below). In that case, non-determinism and optimization techniques during engine conversion may result in larger errors, correct?
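
Purely as a toy illustration of what I mean (this is not my actual network; the weights, activation, and depth are made up), repeatedly applying a layer in FP16 lets rounding error drift away from the FP32 result, and the drift generally grows with depth:

    import numpy as np

    rng = np.random.default_rng(0)
    x32 = rng.standard_normal(256).astype(np.float32)
    x16 = x32.astype(np.float16)
    W32 = rng.standard_normal((256, 256)).astype(np.float32) / 16.0
    W16 = W32.astype(np.float16)

    for i in range(20):  # 20 repeated ("looped") layers
        x32 = np.tanh(W32 @ x32)
        x16 = np.tanh(W16 @ x16)
        if (i + 1) % 5 == 0:
            drift = np.abs(x32 - x16.astype(np.float32)).max()
            print(f"after {i + 1:2d} layers: max abs drift = {drift:.1e}")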


Thanks & Regards,
Aravind