Description
I run into some shape issues (with IShuffleLayer) when trying to run trtexec on my ONNX model, which is a Faster R-CNN model provided by the PyTorch model zoo.
Environment
TensorRT Version : 8.4.3-1+cuda11.6
GPU Type : 1 Quadro RTX 6000
Nvidia Driver Version :
CUDA Version : 11.6
CUDNN Version : not sure — running nvcc --version gives me 11.6 (which is the CUDA version)
Operating System + Version : Ubuntu 18.02
Python Version (if applicable) : 3.9.12
PyTorch Version (if applicable) : 1.12.1+cu116
Baremetal or Container (if container which image + tag) : baremetal
Relevant Files
Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)
Code to reproduce the Faster R-CNN model and export it to ONNX format (named faster_rcnn_base.onnx):
export_pytorch_onnx.py (1.6 KB)
Steps To Reproduce
Please include:
Ran export_pytorch_onnx.py to create the Faster R-CNN model and export it to ONNX format.
Ran /usr/src/tensorrt/bin/trtexec --onnx=faster_rcnn_base.onnx --saveEngine=faster_rcnn_base_engine.trt --verbose
The traceback can be found in this file: original_traceback.txt (127.2 KB)
Some additional steps I tried to fix it:
I tried using polygraphy to fold constants as outlined in another post, but I ran into another error with IShuffleLayer.
Polygraphy command I ran: polygraphy surgeon sanitize --fold-constants faster_rcnn_base.onnx -o faster_rcnn_base_sanitized.onnx
The traceback after running trtexec --onnx=faster_rcnn_base.onnx --saveEngine=faster_rcnn_base_engine.trt --verbose can be found in this file:
polygraphy_traceback.txt (307.8 KB)
NVES · August 30, 2022, 1:07am · #2
Hi,
We recommend you check the supported features in the link below.
These support matrices provide a look into the supported platforms, features, and hardware capabilities of the NVIDIA TensorRT 8.4.3 APIs, parsers, and layers.
You can refer to the link below for the full list of supported operators.
For unsupported operators, you need to create a custom plugin to support the operation.
# Supported ONNX Operators
TensorRT 8.4 supports operators up to Opset 17. The latest information on ONNX operators can be found [here](https://github.com/onnx/onnx/blob/master/docs/Operators.md).
TensorRT supports the following ONNX data types: DOUBLE, FLOAT32, FLOAT16, INT8, and BOOL.
> Note: There is limited support for INT32, INT64, and DOUBLE types. TensorRT will attempt to cast down INT64 to INT32 and DOUBLE down to FLOAT, clamping values to `+-INT_MAX` or `+-FLT_MAX` if necessary.
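The cast-down behaviour described in the note can be illustrated with a small sketch (plain NumPy standing in here for what TensorRT does internally):

```python
import numpy as np

# TensorRT has no native INT64 support: out-of-range values are
# clamped to the INT32 range and then cast down.
vals = np.array([2**40, -(2**40), 7], dtype=np.int64)
clamped = np.clip(
    vals, np.iinfo(np.int32).min, np.iinfo(np.int32).max
).astype(np.int32)
print(clamped.tolist())  # [2147483647, -2147483648, 7]
```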
See below for the support matrix of ONNX operators in ONNX-TensorRT.
## Operator Support Matrix
| Operator | Supported | Supported Types | Restrictions |
|---------------------------|------------|-----------------|------------------------------------------------------------------------------------------------------------------------|
| Abs | Y | FP32, FP16, INT32 | |
| Acos | Y | FP32, FP16 | |
| Acosh | Y | FP32, FP16 | |
| Add | Y | FP32, FP16, INT32 | |
This file has been truncated.
Thanks!
Hi,
After sanitizing the model using Polygraphy, we are facing the following error:
[09/01/2022-08:30:24] [V] [TRT] Registering layer: Reshape_1330 for ONNX node: Reshape_1330
[09/01/2022-08:30:24] [E] Error[4]: [graphShapeAnalyzer.cpp::analyzeShapes::1294] Error Code 4: Miscellaneous (IShuffleLayer Reshape_1330: reshape changes volume. Reshaping [946181376] to [1,4624].)
[09/01/2022-08:30:24] [E] [TRT] parsers/onnx/ModelImporter.cpp:773: While parsing node number 357 [Reshape -> "onnx::Sigmoid_2547"]:
[09/01/2022-08:30:24] [E] [TRT] parsers/onnx/ModelImporter.cpp:774: --- Begin node ---
[09/01/2022-08:30:24] [E] [TRT] parsers/onnx/ModelImporter.cpp:775: input: "onnx::Reshape_2545"
input: "onnx::Reshape_2546"
output: "onnx::Sigmoid_2547"
name: "Reshape_1330"
op_type: "Reshape"
[09/01/2022-08:30:24] [E] [TRT] parsers/onnx/ModelImporter.cpp:776: --- End node ---
[09/01/2022-08:30:24] [E] [TRT] parsers/onnx/ModelImporter.cpp:778: ERROR: parsers/onnx/ModelImporter.cpp:180 In function parseGraph:
[6] Invalid Node - Reshape_1330
[graphShapeAnalyzer.cpp::analyzeShapes::1294] Error Code 4: Miscellaneous (IShuffleLayer Reshape_1330: reshape changes volume. Reshaping [946181376] to [1,4624].)
[09/01/2022-08:30:24] [E] Failed to parse onnx file
[09/01/2022-08:30:24] [I] Finish parsing network model
[09/01/2022-08:30:24] [E] Parsing model failed
[09/01/2022-08:30:24] [E] Failed to create engine from model or file.
[09/01/2022-08:30:24] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8401] # trtexec --onnx=faster_rcnn_base_sanitized.onnx --verbose --workspace=10000
Please refer to the following similar issue and fix your ONNX model.
Hi,
It’s because the Reshape op has hard-coded shapes [1, 3, 85, 20, 20], which should have been [-1, 3, 85, 20, 20].
Could you please let us know how you exported the ONNX model from PyTorch/TF? Did you use the dynamic_axes argument as in (optional) Exporting a Model from PyTorch to ONNX and Running it using ONNX Runtime — PyTorch Tutorials 1.12.1+cu102 documentation?
If you already did, then you need to modify the Reshape in the PyTorch code from [1, 3, 85, 20, 20] to [-1, 3, 85, 20, 20].
opened 07:07AM - 15 Aug 22 UTC · closed 02:09AM - 06 Dec 22 UTC · triaged
Hello, when I converted my ONNX model to TensorRT with the command
`./trtexec --onnx=model.onnx --saveEngine=model.engine`
I got a big diff between the PyTorch result and the TRT result. I located the problem, which might be related to the decoder transformer part of my model, so I converted only the transformer part to ONNX to try to find out what is wrong. But when I ran the command `./trtexec --onnx=decoder_transformer.onnx --saveEngine=decoder_transformer.engine` to convert the ONNX model to TRT, I got an error that did not appear while converting model.onnx.
[screenshot: error]
The error comes from the cross-attention part, but the error disappears when I convert only the cross-attention module to ONNX and TRT with `./trtexec --onnx=cross_attention.onnx --saveEngine=cross_attention.engine`. So finally I cannot figure out how to solve the problem to get a correct TRT result, and I have opened an issue for some help. Thanks~
**Environment**
TensorRT Version: 8.4.1.5+cuda11.6
NVIDIA GPU: A100
NVIDIA Driver Version: 510.47.03
CUDA Version: 11.6
CUDNN Version: 8.4.0.27
Operating System: Ubuntu 20.04.2 LTS
Python Version: 3.7.13
PyTorch Version: 1.10
Thank you.
Thanks for your reply. How exactly do I modify the Reshape op? Is there a way to modify the ONNX graph itself?
I'm using the torchvision models provided by PyTorch, so I think all the Reshape ops already use -1 in the first axis.