I converted a Conformer encoder model from PyTorch to ONNX and then to TensorRT. However, I found that the inference results of the resulting TensorRT engine were wrong.
To figure out where the error occurred, I printed the intermediate results and located the failing operation. In PyTorch it looks like this: y = x.eq(0.0), where x has dtype torch.float32 and y has dtype torch.bool. y is then used as the mask argument of torch.masked_fill() to limit the context window of the attention.
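For reference, here is a minimal sketch of that masking pattern; the tensor names, shapes, and values are illustrative, not taken from my actual model:

```python
import torch

# Illustrative float32 input; in the real model this is an attention-related tensor.
x = torch.tensor([[0.0, 1.0, 0.0, 2.0]], dtype=torch.float32)

# y = x.eq(0.0): float32 input -> bool mask (True where x == 0.0).
y = x.eq(0.0)

# Attention scores with the same shape as the mask (random placeholder values).
scores = torch.randn(1, 4)

# Masked positions are filled with -inf so softmax gives them zero weight,
# which restricts the attention context window.
scores = scores.masked_fill(y, float("-inf"))
attn = torch.softmax(scores, dim=-1)
```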
By the way, this error appears in TensorRT 184.108.40.206 but not in TensorRT 220.127.116.11, and the engine built with TensorRT 18.104.22.168 runs about 35% faster than the one built with TensorRT 22.214.171.124.
TensorRT Version: 126.96.36.199
GPU Type: V100
Nvidia Driver Version: 455.32.00
CUDA Version: 11.1
CUDNN Version: 8.0.4
Operating System + Version: CentOS 7
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.8.1
Baremetal or Container (if container which image + tag):
I tried to upload the ONNX model, but at 233 MB it is too large to upload.
Steps To Reproduce
Use torch.onnx.export() to convert the model from PyTorch to ONNX, roughly as in the sketch below.
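A hedged sketch of the export call; the model object, dummy input shape, opset version, and input/output names here are placeholders, not the exact settings I used:

```python
import torch

# `model` stands in for the Conformer encoder; the (batch, time, feature)
# dummy input shape is illustrative only.
model.eval()
dummy_input = torch.randn(1, 100, 80)

torch.onnx.export(
    model,
    dummy_input,
    "conformer_encoder.onnx",
    opset_version=13,                       # assumed opset, supported by PyTorch 1.8.1
    input_names=["speech"],                 # hypothetical input name
    output_names=["encoder_out"],           # hypothetical output name
    dynamic_axes={"speech": {0: "batch", 1: "time"}},  # variable batch/sequence length
)
```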