Description
I converted a trained BERT-style transformer (trained with a multi-task objective) to ONNX successfully, then used the ONNX parser in TensorRT 8.2.5 (Python API) on an NVIDIA T4 to build an engine. Inference runs and produces output, but the values are all very small numbers close to 2e-45 (varying slightly in exact value). The output shapes ((1, 512, …) × 6) are correct, but in 4 of the 6 output tensors (the ones whose outputs should be integer-valued) the values come back as these tiny decimals. This happens with both FP16 and FP32 engines. Finally, if I use the TensorRT execution provider in ONNX Runtime instead, I get correct outputs.
Environment
TensorRT Version: 8.2.5
GPU Type: NVIDIA T4
Nvidia Driver Version: 450.142.00 (latest)
CUDA Version: 10.2
CUDNN Version: 7.6
Operating System + Version:
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.11
Baremetal or Container: Baremetal, AWS g4dn.24xlarge, AWS Deep Learning AMI 2
NOTE: The model is a BERT-style Transformer encoder with 1 input embedding layer, 12 transformer layers, 12 attention heads, hidden size 768, and 3 classification heads (768 × 4, 768 × 100k, 768 × 45).
Relevant Files
Attached below is a sample of the output: six tensors (truncated).
[tensor([[[2.8026e-45, 4.2039e-45, 2.8026e-45, …, 2.8026e-45,
2.8026e-45, 2.8026e-45]],
tensor([[[0.6362, 0.5518, 0.4241, …, 0.4971, 0.3567, 0.6465]],
...,
tensor([[[1.2612e-44, 1.2612e-44, 1.4013e-44, …, 2.1019e-44,
1.6816e-44, 2.1019e-44]],
tensor([[[0.0162, 0.0495, 0.0873, …, 0.6802, 0.0134, 0.0517]],
tensor([[[1.9689e-40, 5.4349e-40, 3.5303e-40, …, 2.0048e-40,
1.7215e-40, 2.8329e-40]],
tensor([[[1., 0., 0., …, 0., 0., 0.]],
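One observation on the sample above (my own analysis, not confirmed): the magnitudes in the integer-valued heads look exactly like small int32 values whose raw bits are being read as float32 denormals. Reinterpreting the bits of int32 `2` as float32 gives 2.8026e-45, `3` gives 4.2039e-45, and `9` gives 1.2612e-44, matching the sample. A minimal check using only the standard library:

```python
import struct

def int_bits_as_float(i: int) -> float:
    """Reinterpret the raw bits of an int32 as a float32 (no numeric conversion)."""
    return struct.unpack("<f", struct.pack("<i", i))[0]

def float_bits_as_int(f: float) -> int:
    """Inverse direction: recover the int32 whose bits produce this float32."""
    return struct.unpack("<i", struct.pack("<f", f))[0]

# Small ints come out as the denormals seen in the TensorRT output sample.
for i in (2, 3, 9):
    print(i, "->", int_bits_as_float(i))

# And the observed values map back to plausible integer class IDs.
for f in (2.8026e-45, 4.2039e-45, 1.2612e-44):
    print(f, "->", float_bits_as_int(f))
```

If this is what is happening, the bits may be correct and only the binding dtype wrong; as a sanity check one could view the raw output buffer as int32 (e.g. NumPy's `ndarray.view(np.int32)`) and see whether sensible label indices come out.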