I have a trained PyTorch model for object detection that takes in a bird’s eye view (BEV) of a lidar point cloud and produces keypoint masks for cars in the BEV.
Everything runs fine in PyTorch, and I export the model to ONNX.
I run inference again using onnxruntime and this is also fine.
Attached is a standalone project that includes a sample BEV input as an .npy file as well as some reference outputs, which are correct. It also contains the ONNX model.
Now I wish to run inference in C++ using onnx-tensorrt on the Xavier AGX dev kit.
I am able to build the engine using trtexec and run inference, but all the results seem to be wrong.
The complete C++ source code for this is also attached.
I followed the quick start semantic segmentation example. I think the source of confusion mainly lies in copying data to and from the GPU. My input is a 3-channel BEV image of size 480x320 (sizes and number of channels can vary later; a fixed batch size of 1 is fine for now). The pixel values are floats, which are then normalized to [0, 1] by simply dividing each channel by its maximum element. I use Eigen matrices in row-major form to build my input: a class constructs an Eigen tensor (an array of 3 matrices), as you can see in the attached code.
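For reference, here is a condensed sketch of how I prepare the input and copy it to the GPU (simplified from the attached code; function and variable names are placeholders for illustration, following the quick start pattern):

#include <algorithm>
#include <array>
#include <vector>
#include <Eigen/Dense>
#include <cuda_runtime.h>

using RowMajorMatrix =
    Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;

constexpr int kChannels = 3;
constexpr int kHeight   = 480;
constexpr int kWidth    = 320;

// Flatten the 3 row-major channel matrices (H x W) into one contiguous CHW
// buffer (batch size 1, so NCHW == CHW) and normalize each channel by its
// own maximum value, as described above.
std::vector<float> toNCHW(const std::array<RowMajorMatrix, kChannels>& bev)
{
    std::vector<float> buffer(static_cast<size_t>(kChannels) * kHeight * kWidth);
    for (int c = 0; c < kChannels; ++c)
    {
        const float maxVal = std::max(bev[c].maxCoeff(), 1e-6f);  // avoid divide-by-zero
        for (int h = 0; h < kHeight; ++h)
            for (int w = 0; w < kWidth; ++w)
                buffer[(c * kHeight + h) * kWidth + w] = bev[c](h, w) / maxVal;
    }
    return buffer;
}

// Asynchronous host-to-device copy; inputDevice is the device pointer that is
// later passed as the input binding to enqueueV2.
void copyInputToDevice(const std::vector<float>& hostInput, void* inputDevice,
                       cudaStream_t stream)
{
    cudaMemcpyAsync(inputDevice, hostInput.data(),
                    hostInput.size() * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
}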
The expected outputs are again all floats, with sizes 480x320x4, 480x320x37, 480x320x3, and 480x320x2.
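This is roughly how I run the engine and read those outputs back (again a simplified sketch; the binding order and names are placeholders, and in the real code the indices come from the engine):

#include <array>
#include <vector>
#include <cuda_runtime.h>
#include <NvInfer.h>

// Total element counts of the four output tensors (480x320x4, 480x320x37,
// 480x320x3, 480x320x2). TensorRT may lay them out as CHW on the device,
// but the element count per tensor is the same either way.
const std::array<size_t, 4> kOutputSizes = {
    480 * 320 * 4, 480 * 320 * 37, 480 * 320 * 3, 480 * 320 * 2};

// Run the engine and copy the four outputs back to the host.
// bindings is assumed to be ordered [input, output0, output1, output2, output3];
// real code should look the indices up with engine->getBindingIndex(name).
std::vector<std::vector<float>> runAndCopyOutputs(nvinfer1::IExecutionContext* context,
                                                  void** bindings,
                                                  const std::array<void*, 4>& outputDevice,
                                                  cudaStream_t stream)
{
    context->enqueueV2(bindings, stream, nullptr);

    std::vector<std::vector<float>> outputs(4);
    for (size_t i = 0; i < 4; ++i)
    {
        outputs[i].resize(kOutputSizes[i]);
        cudaMemcpyAsync(outputs[i].data(), outputDevice[i],
                        kOutputSizes[i] * sizeof(float),
                        cudaMemcpyDeviceToHost, stream);
    }
    cudaStreamSynchronize(stream);  // wait until all device-to-host copies finish
    return outputs;
}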
A simple example of how to do this without going through too many complicated inter-conversions and type casts would help me a lot.
Sorry for the delayed response, and please let me know if I can help any further in debugging this issue.
It’s recommended to check our /usr/src/tensorrt/samples/sampleMNIST sample first.
Although it reads images from a .pgm file, it converts the data to floating point and subtracts the mean.
Please check the sample for some ideas for your use case.
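The core idea in that sample is a per-element conversion on the host before the buffer is copied to the GPU, roughly along these lines (paraphrased for illustration, not verbatim from the sample):

#include <cstdint>
#include <vector>

// Convert raw 8-bit pixel values to floats and subtract a mean, producing the
// host-side float buffer that is then copied to the GPU.
std::vector<float> toFloatAndSubtractMean(const std::vector<uint8_t>& raw, float mean)
{
    std::vector<float> host(raw.size());
    for (size_t i = 0; i < raw.size(); ++i)
        host[i] = static_cast<float>(raw[i]) / 255.0f - mean;
    return host;
}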
Thanks for the hint. However, I already followed the ONNX MNIST sample and then the quick start guide; the code I have shared is basically a copy of the quick start guide with only the necessary changes.
Therefore, I would be really grateful if you could suggest some other alternatives or have a look at the code I shared.
However, I recently rebuilt the TensorRT engine using trtexec with verbose mode on.
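The build command was roughly along these lines (exact paths and flags may differ from the attached scripts):

trtexec --onnx=model.onnx --saveEngine=model.engine --verbose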
I noticed that I get messages like “ConvTransposed2D does not have an equivalent in this tactic, skipping…”
Could this be the cause of some problems?
My network uses transposed convolutions and sigmoid activations in between.
I am really lost here, since it’s impossible for me to debug what’s going wrong once the input passes over to TensorRT.
Please help me figure this out.
Sorry for the late update.
If the message is only a warning, it should not affect accuracy; it may only have some performance impact.
We are going to check this internally.
So if we compare the TensorRT and ONNXRuntime results, we should be able to reproduce this issue.
Is this correct?
Although the sample is Python-based, TensorRT should produce identical results from C++ and Python.
Maybe there are some differences in the data pre-processing between C++ and Python?
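One way to narrow this down is to dump the exact float buffer that is fed to TensorRT (right before the host-to-device copy) and compare it offline against the buffer fed to onnxruntime. A minimal sketch, assuming files can be written on the device:

#include <fstream>
#include <string>
#include <vector>

// Write the prepared input buffer as raw float32 so it can be loaded in Python
// with numpy.fromfile(path, dtype=np.float32) and compared element-wise with
// the input passed to onnxruntime.
void dumpBuffer(const std::vector<float>& buffer, const std::string& path)
{
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(buffer.data()),
              static_cast<std::streamsize>(buffer.size() * sizeof(float)));
}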