Cannot Deploy PyTorch Model on DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
Jetson Orin NX
• DeepStream Version
6.1.1
• JetPack Version (valid for Jetson only)
5.0.2
• TensorRT Version
8.4.1
• NVIDIA GPU Driver Version (valid for GPU only)
35.1.0
• CUDA version
CUDA 11.4
• Issue Type( questions, new requirements, bugs)
bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
Hello,

I’m having trouble deploying a PyTorch model on DeepStream. I am using a Jetson Orin NX with JetPack 5.0.2 (TensorRT 8.4.1) and DeepStream 6.1.1. I developed a retinanet_resnet50_fpn model in PyTorch 1.13 and exported it to an ONNX model. However, I have not been able to get the ONNX model to run on DeepStream, and I have also not been able to convert the ONNX model to a TensorRT engine using trtexec.
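In case it is useful, here is a minimal Python sketch of the conversion I am attempting, equivalent to the failing trtexec run (the file names and the 1 GiB workspace size are just my choices, and error handling is kept deliberately simple):

import tensorrt as trt

# Build a TensorRT engine from the exported ONNX file, printing any
# parser errors so the failing node is easy to identify.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("retinanet_test_pytorch.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB
serialized_engine = builder.build_serialized_network(network, config)
if serialized_engine is None:
    raise SystemExit("engine build failed")
with open("retinanet_test_pytorch.engine", "wb") as f:
    f.write(serialized_engine)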

I tried to resolve my issue by following the solution in “IShuffleLayer applied to shape tensor must have 0 or 1 reshape dimensions: dimensions were [-1,2]” because I see the same error message after running trtexec. Unfortunately, I still could not convert my ONNX model to a TensorRT engine after running polygraphy surgeon sanitize --fold-constants. I have also tried the DeepStream container, torch2trt, and other PyTorch models (e.g., resnet50) with different opset versions (11 and 13), but I have not had success so far. I would greatly appreciate your assistance. Thank you.
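For reference, here is roughly the constant-folding step I ran, expressed with the ONNX GraphSurgeon Python API instead of the polygraphy CLI (a sketch only; I believe polygraphy’s --fold-constants uses ONNX GraphSurgeon under the hood, and the file names match my export below):

import onnx
import onnx_graphsurgeon as gs

# Fold constant subgraphs (including shape computations, which are what
# the IShuffleLayer error complains about) and remove dead nodes.
graph = gs.import_onnx(onnx.load("retinanet_test_pytorch.onnx"))
graph.fold_constants().cleanup()
onnx.save(gs.export_onnx(graph), "retinanet_test_folded.onnx")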

Please see the attached files for the complete error messages from DeepStream and trtexec. I’ve included the most relevant part of the trtexec error message below for your convenience:

(Running trtexec)
[6] Invalid Node - /transform/Pad
[shuffleNode.cpp::symbolicExecute::392] Error Code 4: Internal Error (/transform/Reshape: IShuffleLayer applied to shape tensor must have 0 or 1 reshape dimensions: dimensions were [-1,2])

deepstream_error.txt (2.9 KB)

trtexec_error.txt (41.3 KB)

Hello,
The failure happens in trtexec, so let’s move this topic to the TensorRT forum for better support. Thanks.

Hi,

Are you facing the same error on the latest TensorRT version, 8.5.2?

Thank you.

Hello, I didn’t want to try TensorRT 8.5.2 because I’m using DeepStream 6.1.1, which comes packaged with TensorRT 8.4.1. As I understand it, a TensorRT engine can only be deserialized by the same TensorRT version that built it, so I’m not sure an engine file created by TensorRT 8.5.2 would work with DeepStream 6.1.1. Please let me know if I am mistaken. Thank you.

Could you please share a minimal repro ONNX model with us here or via DM for better debugging?

Thank you

Hello,
Thank you for your help. The ONNX file is larger than the upload limit. However, you should be able to reproduce my error easily because none of the torchvision models have worked for me. In particular, a dummy RetinaNet model should give you the error in both DeepStream and TensorRT:

import torch
from torchvision.models.detection import retinanet_resnet50_fpn

# Untrained RetinaNet in eval mode. Note the keyword is weights_backbone
# (not backbone_weights) in torchvision 0.14 / PyTorch 1.13.
model = retinanet_resnet50_fpn(weights=None, weights_backbone=None, num_classes=12).eval()

# Export with a fixed-size dummy input.
BATCH_SIZE = 10
dummy_input = torch.randn(BATCH_SIZE, 3, 1024, 1376)
torch.onnx.export(model, dummy_input, "retinanet_test_pytorch.onnx")

Running trtexec on this ONNX file gives me the error above. Please let me know what you see. Thank you!
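P.S. To rule out a broken export rather than a TensorRT limitation, one quick check (a minimal sketch, assuming onnxruntime is installed; the CPU provider keeps it independent of TensorRT) is to run the exported model through onnxruntime:

import numpy as np
import onnx
import onnxruntime as ort

# Validate the graph structurally, then run one dummy batch on the CPU
# execution provider; if this works, the export itself is sound.
model = onnx.load("retinanet_test_pytorch.onnx")
onnx.checker.check_model(model)

session = ort.InferenceSession("retinanet_test_pytorch.onnx",
                               providers=["CPUExecutionProvider"])
dummy = np.random.randn(10, 3, 1024, 1376).astype(np.float32)
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: dummy})
print([o.shape for o in outputs])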

Hello everyone,
I feel I’ve learned more than I ever wanted to about TensorRT. Ultimately, TensorRT is finicky: it’s not guaranteed to work with every PyTorch model, because there are various network operations that TensorRT doesn’t know how to handle (e.g., bias=False, padding, views, etc.). You either have to find an implementation that is compatible with TensorRT or build your own PyTorch model. I was able to get this implementation of YOLO v1 working by setting bias=True; I also got this implementation of RetinaNet to work. Good luck!
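To illustrate the bias=True workaround mentioned above (my own sketch, not the code from the linked implementations), here is one way to give every bias-free Conv2d a zero-initialized bias before export; since the added biases are zero, the model’s outputs are unchanged and only the exported graph pattern differs:

import torch
import torch.nn as nn

def add_conv_biases(module: nn.Module) -> nn.Module:
    # Recursively replace every Conv2d built with bias=False by an
    # equivalent Conv2d with a zero-initialized bias, copying weights.
    for name, child in module.named_children():
        if isinstance(child, nn.Conv2d) and child.bias is None:
            replacement = nn.Conv2d(
                child.in_channels,
                child.out_channels,
                kernel_size=child.kernel_size,
                stride=child.stride,
                padding=child.padding,
                dilation=child.dilation,
                groups=child.groups,
                bias=True,
                padding_mode=child.padding_mode,
            )
            with torch.no_grad():
                replacement.weight.copy_(child.weight)
                replacement.bias.zero_()
            setattr(module, name, replacement)
        else:
            add_conv_biases(child)
    return module

# Usage: model = add_conv_biases(model) before torch.onnx.export(...)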

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.