Big difference between inference results of ONNX Runtime and TensorRT

Description

Running inference on my ONNX model with TensorRT produces bad outputs, while inference with ONNX Runtime is correct for the same ONNX model and inputs.

Environment

TensorRT Version: 10.8
GPU Type: RTX 3090
Nvidia Driver Version: 525.105
CUDA Version: 12.0.1
CUDNN Version: 8
Operating System + Version: Ubuntu 22.04 Docker container on an Ubuntu 18.04 host
Python Version (if applicable): 3.10.12
TensorFlow Version (if applicable): N/A
PyTorch Version (if applicable): N/A
Baremetal or Container (if container which image + tag): nvidia/cuda:12.0.1-cudnn8-devel-ubuntu22.04

Relevant Files

my_sample.zip (25.8 MB)

Steps To Reproduce

Download the attached my_sample.zip and unzip it under TensorRTOSS10.8/samples/python/
(the ONNX model needs a password to unzip: wrongtensorrtresult)

The ONNX network is an object detection network that outputs a heatmap representing the center points of objects; test.npy is the tensor saved after pre-processing an image.

`python run_onnx.py hm.onnx test.npy`

It will save two PNG outputs, and you can see that the TensorRT output differs from the ONNX Runtime output. The ONNX Runtime PNG is correct: it shows 3 dots representing 3 objects, while the TensorRT output is all black.
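If you prefer not to download the zip, the comparison can be reproduced with a short script along these lines (a simplified sketch using the Polygraphy Python API, not the exact run_onnx.py; it assumes a single input tensor named "input" and a single heatmap output, so adjust the names to match the actual model):

```python
import numpy as np
from PIL import Image
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import EngineFromNetwork, NetworkFromOnnxPath, TrtRunner

def save_heatmap(arr, path):
    # Collapse batch/channel dims and rescale to 0-255 for a quick visual check.
    hm = np.squeeze(np.asarray(arr))
    if hm.ndim == 3:                      # (C, H, W) -> max over channels
        hm = hm.max(axis=0)
    rng = hm.max() - hm.min()
    hm = (255 * (hm - hm.min()) / (rng + 1e-8)).astype(np.uint8)
    Image.fromarray(hm).save(path)

data = np.load("test.npy").astype(np.float32)
feed = {"input": data}  # placeholder name: use the model's real input tensor name

# ONNX Runtime inference
with OnnxrtRunner(SessionFromOnnx("hm.onnx")) as runner:
    ort_out = list(runner.infer(feed).values())[0]
save_heatmap(ort_out, "onnx_output.png")

# TensorRT inference (engine built from the same ONNX model)
with TrtRunner(EngineFromNetwork(NetworkFromOnnxPath("hm.onnx"))) as runner:
    trt_out = list(runner.infer(feed).values())[0]
save_heatmap(trt_out, "trt_output.png")

print("max abs diff:", np.abs(ort_out - trt_out).max())
```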

`polygraphy run hm.onnx --trt --onnxrt --atol 0.001 --rtol 0.001`

Or you can add `--data-loader-script data_loader.py`, which reads test.npy as the input.

You can see the comparison fails with large differences.
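For completeness, data_loader.py is just a small Polygraphy data-loader script along these lines (a sketch; again, "input" is a placeholder for the model's actual input tensor name):

```python
# data_loader.py - Polygraphy data-loader script (simplified sketch)
import numpy as np

def load_data():
    # Polygraphy calls load_data() and iterates over the feed dicts it yields.
    # "input" is a placeholder; replace it with the real input tensor name.
    yield {"input": np.load("test.npy").astype(np.float32)}
```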

Hi @vesor731ji,

I am trying to reproduce this and will share it with Engineering for further updates.

Thanks

Any updates?