Wrong result in TensorRT, but it seems something is working correctly

chulhoonjang · September 9, 2022, 6:19pm

Description

I am converting an ONNX semantic segmentation model to TensorRT on an NVIDIA Jetson AGX Xavier dev kit.

The ONNX model gives nearly the same result as the PyTorch model (not exactly identical, but very close).

However, the result of the TensorRT model is strange. The output should lie within 0-27, because it is the output of argmax and there are 28 segment labels. Instead, the values in the TensorRT output are almost all close to 0. I use np.unique on the outputs to compare the values listed under ‘labels of segment’.

input image

output of the onnx model

labels of segment: [ 0 1 3 4 6 9 10 11 15 16 19 22]

output of the tensorRT model
labels of segment: [0.0e+00 1.4e-45 4.2e-45 5.6e-45 8.4e-45 1.3e-44 1.4e-44 1.5e-44 2.1e-44
2.2e-44 2.7e-44 3.1e-44]

Interestingly, if I multiply the output of the TensorRT model by 5e+45 and save it as a PNG, I can see this image.

What’s wrong with this? Please help me.

Environment

TensorRT Version: 8.0.1.6
GPU Type: tegra194
Nvidia Driver Version: NVIDIA Jetson AGX Xavier 16GB, Jetpack 4.6 [L4T 32.6.1]
CUDA Version: 10.2.300
CUDNN Version: 8.2.1.32
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.9
Baremetal or Container (if container which image + tag):

Relevant Files

You can download the model and scripts here

Steps To Reproduce

1) Run the ONNX model:
python3 run_onnx.py
2) Convert the ONNX model to a TensorRT engine:
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.trt --explicitBatch --fp16
3) Run the TensorRT model:
python3 run_trt.py

NVES · September 9, 2022, 6:38pm

Hi,
Please share the ONNX model and the script, if not shared already, so that we can assist you better.
Meanwhile, you can try a few things:

1) Validate your model with the snippet below.

check_model.py

import onnx

filename = "model.onnx"  # path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)

2) Try running your model with the trtexec command.

If you are still facing the issue, please share the trtexec "--verbose" log for further debugging.
Thanks!

chulhoonjang · September 13, 2022, 8:04pm

I already shared the onnx model in the relevant files. You can download the model and scripts with the link.

spolisetty · September 14, 2022, 2:54pm

Could you please grant us access to the issue repro?

chulhoonjang · September 14, 2022, 3:07pm

I have given you access. Please try again.

spolisetty · September 23, 2022, 7:15am

Hi,

We could reproduce the same behavior.
Please allow us some time to work on this.

Thank you.

chulhoonjang · September 23, 2022, 1:25pm

Good to hear you can reproduce the issue. Look forward to the update. Thank you.

spolisetty · September 29, 2022, 5:43am

Hi,

The problem here is in the script - it allocated a host buffer of type np.float32, rather than the datatype actually used in the engine output. The following patch fixes the problem for us.

diff --git a/run_trt.py b/run_trt.py
index add0030..5c645c0 100644
--- a/run_trt.py
+++ b/run_trt.py
@@ -23,7 +23,7 @@ with open('model.trt', 'rb') as f:
 # create buffer
 for binding in engine:
     size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
-    host_mem = cuda.pagelocked_empty(shape=[size],dtype=np.float32)
+    host_mem = cuda.pagelocked_empty(shape=[size],dtype=trt.nptype(engine.get_binding_dtype(binding)))
     cuda_mem = cuda.mem_alloc(host_mem.nbytes)

Thank you.

chulhoonjang · September 29, 2022, 6:23pm

Thank you for the reply. I resolved the issue. For the output, the type was supposed to be np.int32. I appreciate it.
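
For anyone who lands here with the same symptom: the tiny values in the original post are consistent with int32 label bits being read as float32. The sketch below is only an illustration of that reinterpretation, using the Python standard library instead of the actual run_trt.py buffers. The helper names `int32_bits_as_float32` and `float32_bits_as_int32` are made up for this example and are not part of TensorRT, PyCUDA, or NumPy.

```python
import struct


def int32_bits_as_float32(value: int) -> float:
    """Reinterpret the 4 bytes of an int32 as an IEEE 754 float32."""
    return struct.unpack("<f", struct.pack("<i", value))[0]


def float32_bits_as_int32(value: float) -> int:
    """Inverse of the above: read the float32 bytes back as an int32."""
    return struct.unpack("<i", struct.pack("<f", value))[0]


# Labels reported for the ONNX model in the original post.
onnx_labels = [0, 1, 3, 4, 6, 9, 10, 11, 15, 16, 19, 22]

# What those labels look like when the same bytes are read as float32.
misread = [int32_bits_as_float32(v) for v in onnx_labels]
print(["%.1e" % x for x in misread])

# Reading the bytes with the right dtype recovers the labels unchanged.
recovered = [float32_bits_as_int32(x) for x in misread]
print(recovered)
```

Small integers map to float32 subnormals that are multiples of about 1.4e-45, so multiplying by 5e+45 scales each value to roughly 7 times its label, which would explain why the rescaled output looked like a plausible segmentation image. With NumPy, viewing an existing float32 buffer as int32 (`host_mem.view(np.int32)`) performs the same reinterpretation without copying, although allocating the buffer with the engine's dtype, as in the patch above, is the cleaner fix.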

system · October 13, 2022, 6:23pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
