Description
I am learning how to perform inference on the engine file.
To start with,
- I created a PyTorch-based classification model in Google Colab.
- Trained the classification model on the MNIST dataset and verified that the trained model achieves 95%+ accuracy.
- Exported the model to ONNX (within Google Colab) and verified that the model works.
- At this point I exported the output vector (which has 10 neurons, so 1x10):

** ONNX OUTPUT IN COLAB **

NEURON    OUTPUT_VALUE
0 --> -8.656696319580078
1 --> -8.746328353881836
2 --> -7.678134441375732
3 --> -8.662064552307129
4 --> -0.0022408869117498398
5 --> -8.611289978027344
6 --> -8.606916427612305
7 --> -7.697903633117676
8 --> -8.510454177856445
9 --> -8.295673370361328
- Next, I exported the model to TensorRT (again within Google Colab) and verified that the model works.
- Just like before, I exported the output vector; below is the output:

** ENGINE FILE COLAB **

NEURON    OUTPUT_VALUE
0 --> -8.653183937072754
1 --> -8.742051124572754
2 --> -7.669453144073486
3 --> -8.658066749572754
4 --> -0.0022506495006382465
5 --> -8.608777046203613
6 --> -8.602890968322754
7 --> -7.695576190948486
8 --> -8.506699562072754
9 --> -8.293564796447754
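To put a number on how close the two Colab runs are, here is a quick element-wise comparison (plain Python; the values are copied verbatim from the two tables above):

```python
# Logits copied from the two Colab runs above (ONNX Runtime vs TensorRT engine).
onnx_colab = [-8.656696319580078, -8.746328353881836, -7.678134441375732,
              -8.662064552307129, -0.0022408869117498398, -8.611289978027344,
              -8.606916427612305, -7.697903633117676, -8.510454177856445,
              -8.295673370361328]
trt_colab = [-8.653183937072754, -8.742051124572754, -7.669453144073486,
             -8.658066749572754, -0.0022506495006382465, -8.608777046203613,
             -8.602890968322754, -7.695576190948486, -8.506699562072754,
             -8.293564796447754]

# Largest absolute difference across the 10 neurons.
max_abs_diff = max(abs(a - b) for a, b in zip(onnx_colab, trt_colab))
print(f"max |diff| = {max_abs_diff:.6f}")  # max |diff| = 0.008681
```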
As we can see, both outputs closely match. Now I wanted to explore further with C++.
- I downloaded this ONNX file onto my local Linux machine (whose specifications are given in the Environment section below).
- Used /usr/src/tensorrt/bin/trtexec --onnx=MNIST_Classifier.onnx --saveEngine=MNIST_Classifier_f16.engine --fp16 to export to an engine file.
- Then I used C++ code to execute the engine file. To my surprise, I got an output tensor like this:

** ENGINE FILE LOCAL MACHINE WITH C++ **

NEURON    OUTPUT_VALUE
0 --> -8.794
1 --> -8.88364
2 --> -8.26091
3 --> -8.79937
4 --> -0.00169589
5 --> -8.74809
6 --> -8.74422
7 --> -8.21721
8 --> -8.64776
9 --> -8.37213
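To quantify the mismatch, I ran the same element-wise check between the Colab engine vector and the local C++ vector, plus an argmax check (plain Python; values copied from the tables above — this is just a sanity check, not a diagnosis):

```python
# Logits copied from the Colab engine run and the local C++ run above.
trt_colab = [-8.653183937072754, -8.742051124572754, -7.669453144073486,
             -8.658066749572754, -0.0022506495006382465, -8.608777046203613,
             -8.602890968322754, -7.695576190948486, -8.506699562072754,
             -8.293564796447754]
trt_local = [-8.794, -8.88364, -8.26091, -8.79937, -0.00169589,
             -8.74809, -8.74422, -8.21721, -8.64776, -8.37213]

max_abs_diff = max(abs(a - b) for a, b in zip(trt_colab, trt_local))
pred_colab = max(range(len(trt_colab)), key=trt_colab.__getitem__)
pred_local = max(range(len(trt_local)), key=trt_local.__getitem__)
print(f"max |diff| = {max_abs_diff:.4f}")                  # max |diff| = 0.5915
print(f"argmax: colab={pred_colab}, local={pred_local}")   # both predict class 4
```

So the predicted digit is the same, but the logits differ by up to ~0.59, far more than the ~0.009 gap between the two Colab runs.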
Basically, what I am seeing is ONNX_IN_COLAB == TENSORRT_IN_COLAB != TENSORRT_IN_LOCAL_MACHINE_C++.
Is this difference in output (the output tensor) expected?
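One data point that may be relevant: the local engine was built with --fp16. As a rough reference, here is how much a single half-precision round-trip moves a logit of this magnitude (stdlib struct half-float packing; this only bounds per-value rounding, not error accumulated through the layers):

```python
import struct

def fp16_roundtrip(x: float) -> float:
    """Round a float to the nearest IEEE half-precision value and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

x = -8.656696319580078             # neuron 0 from the Colab ONNX run
r = fp16_roundtrip(x)
print(r, abs(r - x))               # -8.65625, difference ~0.00045
# FP16 spacing near |x| = 8 is 2**-7 = 0.0078125, so single-value
# rounding alone can move a logit by at most ~0.0039.
```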
Environment
TensorRT Version: 8.4.1
GPU Type: A5000
Nvidia Driver Version:
CUDA Version: 11.6
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.0.1+cu117
Baremetal or Container (if container which image + tag):
Relevant Files
For local-machine C++ inference, I used the code at the link below (modified to suit classification):
YOLOv8-TensorRT/csrc/detect/normal at main · triple-Mu/YOLOv8-TensorRT (github.com)

