Conversion error for Mask RCNN ONNX model due to weights of different types

Description

The Torchvision Mask RCNN model cannot be converted into a TensorRT engine due to the following error:

[05/09/2023-14:46:31] [E] [TRT] ModelImporter.cpp:731: ERROR: ModelImporter.cpp:172 In function parseGraph:
[6] Invalid Node - /rpn/anchor_generator/ConstantOfShape
[network.cpp::setWeightsName::3366] Error Code 1: Internal Error (Error: Weights of same values but of different types are used in the network!)

Please let me know how to fix this issue.

Environment

TensorRT Version: 8.5.2.2
GPU Type: Jetson Orin Nano
Nvidia Driver Version: JetPack 5.1.1
CUDA Version: 11.4.19
CUDNN Version: 8.6.0

Operating System + Version: JetPack 5.1.1
Python Version (if applicable): 3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.0.0
Baremetal or Container (if container which image + tag): Baremetal

Relevant Files

maskrcnn.onnx
maskrcnn_pth2onnx.py (1.9 KB)

Steps To Reproduce

/usr/src/tensorrt/bin/trtexec --onnx=maskrcnn.onnx --saveEngine=maskrcnn.trt --workspace=2048 --exportProfile=profile.json --verbose

Steps to regenerate the ONNX model on x86 platform

  1. Place the attached maskrcnn_pth2onnx.py in any user path.
  2. Execute the following docker command from the directory that contains maskrcnn_pth2onnx.py.
docker run -it --rm --gpus=all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --network=host -v ${PWD}:"/maskrcnn_test" -w "/maskrcnn_test" "nvcr.io/nvidia/pytorch:23.04-py3" python maskrcnn_pth2onnx.py

Steps to reproduce the issue on x86 platform

I’d like to have a TensorRT engine for the Jetson Orin, but the issue can also be reproduced on an x86 platform.

docker run -it --rm --gpus=all --network=host -v ${PWD}:"/maskrcnn_test" -w "/maskrcnn_test" "nvcr.io/nvidia/tensorrt:23.01-py3" trtexec --onnx=maskrcnn.onnx --saveEngine=maskrcnn.trt --workspace=2048 --exportProfile=profile.json --verbose

Hi,
Please share the ONNX model and the script, if not shared already, so that we can assist you better.
Meanwhile, you can try a few things:

  1. Validate your model with the snippet below.

check_model.py

import sys
import onnx

filename = sys.argv[1]  # path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)

  2. Try running your model with the trtexec command.

In case you are still facing the issue, please share the trtexec --verbose log for further debugging.
Thanks!

@AakankshaS
Thank you for your reply.
I have already attached the ONNX model maskrcnn.onnx to the first post.
I have validated the model in the way you suggested; please find the code in maskrcnn_pth2onnx.py, which is also attached to the first post.

Hello. Is there any update on this issue?

The error message I got said “Weights of same values but of different types are used in the network!”, so I suspected that the “value” attribute of /rpn/anchor_generator/ConstantOfShape was invalid: its type is INT64, while TensorRT supports FP32 only. I therefore removed the “value” attribute with onnx-graphsurgeon, so that TensorRT would fall back to FP32 as the ONNX specification defaults. However, trtexec still reports the same error.
Could you explain what this error message means?

Python code to remove the “value” attribute from the /rpn/anchor_generator/ConstantOfShape node:
modify_onnx_model.py (1.4 KB)

trtexec log for the original model:
trtexec_log_org.txt (507.2 KB)

trtexec log for the modified model:
trtexec_log_mod.txt (507.2 KB)

Thanks.

Hi,

We tried the polygraphy tool and trtexec and faced the following error. It looks like there is a problem with the model input.

polygraphy surgeon sanitize maskrcnn_modified.onnx --fold-constants --output model_folded.onnx
[05/30/2023-06:15:14] [V] [TRT] Registering layer: /roi_heads/Reshape for ONNX node: /roi_heads/Reshape
[05/30/2023-06:15:14] [E] Error[4]: [shapeContext.cpp::operator()::3602] Error Code 4: Shape Error (reshape wildcard -1 has infinite number of solutions or no solution. Reshaping [0,8] to [0,-1].)
[05/30/2023-06:15:14] [E] [TRT] ModelImporter.cpp:771: While parsing node number 633 [Reshape -> "/roi_heads/Reshape_output_0"]:
[05/30/2023-06:15:14] [E] [TRT] ModelImporter.cpp:772: --- Begin node ---
[05/30/2023-06:15:14] [E] [TRT] ModelImporter.cpp:773: input: "/roi_heads/box_predictor/bbox_pred/Gemm_output_0"
input: "/roi_heads/Concat_output_0"
output: "/roi_heads/Reshape_output_0"
name: "/roi_heads/Reshape"
op_type: "Reshape"
attribute {
  name: "allowzero"
  i: 0
  type: INT
}

[05/30/2023-06:15:14] [E] [TRT] ModelImporter.cpp:774: --- End node ---
[05/30/2023-06:15:14] [E] [TRT] ModelImporter.cpp:777: ERROR: ModelImporter.cpp:195 In function parseGraph:
[6] Invalid Node - /roi_heads/Reshape
[shapeContext.cpp::operator()::3602] Error Code 4: Shape Error (reshape wildcard -1 has infinite number of solutions or no solution. Reshaping [0,8] to [0,-1].)
[05/30/2023-06:15:14] [E] Failed to parse onnx file
[05/30/2023-06:15:14] [I] Finished parsing network model. Parse time: 0.473263
[05/30/2023-06:15:14] [E] Parsing model failed
[05/30/2023-06:15:14] [E] Failed to create engine from model or file.
[05/30/2023-06:15:14] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8601] # trtexec --onnx=model_folded.onnx --verbose --workspace=20000
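As a side note, the Shape Error above can be reproduced outside TensorRT: when a tensor has zero elements (here the head produced shape [0, 8], i.e. zero boxes), a -1 reshape wildcard has no unique solution because any size n satisfies 0 * n == 0. A minimal NumPy illustration of the same ambiguity (assuming NumPy's reshape semantics mirror the TensorRT check):

```python
import numpy as np

# Zero detections: the tensor has shape (0, 8) and zero elements.
x = np.zeros((0, 8))
try:
    # The -1 wildcard cannot be resolved: every size n gives 0 * n == 0.
    x.reshape(0, -1)
except ValueError as err:
    print("reshape failed:", err)
```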

Please refer to the similar issues below.

Thank you.