Unable to convert torchvision model to a TRT engine compatible with the latest version of DeepStream on Jetson

Please provide complete information as applicable to your setup.

• Hardware Platform: Jetson Xavier
• DeepStream Version: 6.2
• JetPack Version: 5.1
• TensorRT Version: 8.5.2 (TensorRT v8502 per the trtexec log below)
• Issue Type: Question/bug
• How to reproduce the issue?
To reproduce, download the torchvision RetinaNet model and export it as an ONNX file (I have tried every conversion method I can find, and all have this issue).
Then attempt to convert this file into a TRT engine (using trtexec, DeepStream's automatic conversion, torch2trt, etc.) on a Jetson Xavier running JetPack 5.1.

I am attempting to convert the torchvision RetinaNet model to a TRT engine that can be run on the Jetson.
https://pytorch.org/vision/main/models/retinanet.html

Exporting with torch.onnx.export and then converting with trtexec results in an error caused by ConstantOfShape nodes having mismatched datatypes.

According to the ONNX documentation (onnx/Operators.md at main · onnx/onnx · GitHub), ConstantOfShape only works with INT64 as input.

However, TRT will only accept a ConstantOfShape value of type FP32, per the TensorRT ONNX operator support documentation.

I have tried scripting and tracing the model before export, as well as using polygraphy surgeon to clean up the graph before conversion, with no effect.

Is there a way to convert and run this model? Otherwise this seems like a bug, as the requirements of the latest supported ONNX opset do not match trtexec's capabilities.

Which DeepStream sample are you testing? Could you share the whole test log and the configuration file?

Hi,
At this point the DeepStream sample and config are irrelevant, as I am still working on getting a TRT engine. I am using trtexec to create this engine.

The only part of this that is DeepStream-on-Jetson-specific is the set of ONNX, TensorRT, and CUDA versions that are available.

However, if you could provide some insight on the issues I am having with the conversion from Torchvision’s RetinaNet model to a TRT engine, that would be great.

Thank you!

Hey, which opset version did you use for onnx.export? Could you try with opset=13?

Hi,
I was previously using opset=16, but I just tried with opset=13.
The error is pretty much the same, provided below:

[04/26/2023-12:52:46] [I] [TRT] ----------------------------------------------------------------
[04/26/2023-12:52:46] [I] [TRT] Input filename: exported_model.onnx
[04/26/2023-12:52:46] [I] [TRT] ONNX IR version: 0.0.7
[04/26/2023-12:52:46] [I] [TRT] Opset version: 13
[04/26/2023-12:52:46] [I] [TRT] Producer name: pytorch
[04/26/2023-12:52:46] [I] [TRT] Producer version: 2.0.0
[04/26/2023-12:52:46] [I] [TRT] Domain:
[04/26/2023-12:52:46] [I] [TRT] Model version: 0
[04/26/2023-12:52:46] [I] [TRT] Doc string:
[04/26/2023-12:52:46] [I] [TRT] ----------------------------------------------------------------
[04/26/2023-12:52:46] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/26/2023-12:52:47] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[04/26/2023-12:52:51] [E] Error[1]: [network.cpp::setWeightsName::3366] Error Code 1: Internal Error (Error: Weights of same values but of different types are used in the network!)
[04/26/2023-12:52:51] [E] [TRT] ModelImporter.cpp:726: While parsing node number 1120 [ConstantOfShape -> "/anchor_generator/ConstantOfShape_output_0"]:
[04/26/2023-12:52:51] [E] [TRT] ModelImporter.cpp:727: --- Begin node ---
[04/26/2023-12:52:51] [E] [TRT] ModelImporter.cpp:728: input: "/anchor_generator/Constant_12_output_0"
output: "/anchor_generator/ConstantOfShape_output_0"
name: "/anchor_generator/ConstantOfShape"
op_type: "ConstantOfShape"
attribute {
  name: "value"
  t {
    dims: 1
    data_type: 7
    raw_data: "\000\000\000\000\000\000\000\000"
  }
  type: TENSOR
}
doc_string: "/usr/local/lib/python3.8/dist-packages/torchvision-0.16.0a0+0d75d9e-py3.8-linux-aarch64.egg/torchvision/models/detection/anchor_utils.py(121): \n/usr/local/lib/python3.8/dist-packages/torchvision-0.16.0a0+0d75d9e-py3.8-linux-aarch64.egg/torchvision/models/detection/anchor_utils.py(119): forward\n/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py(1488): _slow_forward\n/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py(1501): _call_impl\n/usr/local/lib/python3.8/dist-packages/torchvision-0.16.0a0+0d75d9e-py3.8-linux-aarch64.egg/torchvision/models/detection/retinanet.py(636): forward\n/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py(1488): _slow_forward\n/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py(1501): _call_impl\n/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py(118): wrapper\n/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py(127): forward\n/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py(1501): _call_impl\n/usr/local/lib/python3.8/dist-packages/torch/jit/_trace.py(1260): _get_trace_graph\n/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py(893): _trace_and_get_graph_from_model\n/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py(989): _create_jit_graph\n/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py(1113): _model_to_graph\n/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py(1533): _export\n/usr/local/lib/python3.8/dist-packages/torch/onnx/utils.py(506): export\nconvert_pickle.py(37): \n"
[04/26/2023-12:52:51] [E] [TRT] ModelImporter.cpp:729: --- End node ---
[04/26/2023-12:52:51] [E] [TRT] ModelImporter.cpp:731: ERROR: ModelImporter.cpp:172 In function parseGraph:
[6] Invalid Node - /anchor_generator/ConstantOfShape
[network.cpp::setWeightsName::3366] Error Code 1: Internal Error (Error: Weights of same values but of different types are used in the network!)
[04/26/2023-12:52:51] [E] Failed to parse onnx file
[04/26/2023-12:52:51] [I] Finish parsing network model
[04/26/2023-12:52:51] [E] Parsing model failed
[04/26/2023-12:52:51] [E] Failed to create engine from model or file.
[04/26/2023-12:52:51] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8502] # trtexec --onnx=exported_model.onnx --saveEngine=converted.engine

I tried folding constants with the command below; the ConstantOfShape error is then gone, but trtexec fails on a TopK whose K is not an initializer.
polygraphy surgeon sanitize --fold-constants -o RetinaNet_constantsfolded.onnx RetinaNet.onnx
You may try it; please make sure onnxruntime is installed in your environment.

Thank you for the suggestion.
I encountered the same error ("topK whose K is not an initializer") after folding constants.

Do you know how to proceed from that point? I was unable to find any information on solving the topK error, so that seemed like another dead-end.

Sorry for the late reply, I was OOTO last week. It looks like the TopK operators are part of NMS; one way around that error is to replace those NMS ops with an NMS plugin, as sketched below.

Thank you, I will look into that.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.