Best way to convert PyTorch to TensorRT model

Description

I am trying to understand the differences between the various ways to compile/export a PyTorch model to a TensorRT engine. I’m using PyTorch 2.2.

Background: My end goal is to export and use my detectron2 PyTorch trained model as a TensorRT .engine file in order to use it in NVIDIA Deepstream afterwards.

This got me reading about TorchScript, torch.fx, torch.export, torch.compile, and TorchDynamo with different backends, e.g. the torch_tensorrt backend (whose compiled output apparently cannot be serialized?), as well as the standalone torch_tensorrt project.

Since the model (mask2former with a Swin Transformer backbone) and its codebase include complex code constructs and dynamic control flow, I’ve ruled out torch.fx and all tracing-based methods (please correct me if my thinking is wrong).
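To illustrate the dynamic control flow concern, here is a minimal toy sketch (the function is hypothetical, not from mask2former) of how tracing bakes in a single branch while torch.jit.script preserves the control flow:

```python
import torch

def f(x):
    # data-dependent control flow: tracing records only one branch
    if x.sum() > 0:
        return x * 2
    return x - 1

pos = torch.ones(3)
neg = torch.full((3,), -2.0)

# trace with a positive example input: the `else` branch is never recorded
traced = torch.jit.trace(f, pos)  # emits a TracerWarning about the bool conversion
# scripting compiles the control flow itself
scripted = torch.jit.script(f)

print(f(neg))         # eager:    tensor([-3., -3., -3.])
print(traced(neg))    # traced:   tensor([-4., -4., -4.])  <- wrong branch baked in
print(scripted(neg))  # scripted: tensor([-3., -3., -3.])
```

This is why for models like this the realistic choices narrow down to scripting or the Dynamo-based paths.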

I’m now left with these questions:

  1. Should I first convert to TorchScript using torch.jit.script? Is it the only “easy” option, given graph breaks and the need to run outside a Python runtime?
  2. Is torch.compile (TorchDynamo) with the PyTorch model as input suitable for my goal (eventually serializing to a TensorRT engine file for use in Deepstream), or should I first convert the model to TorchScript?
  3. After compiling the model with any of the above methods, my understanding is that I still need torch_tensorrt to serialize it to an engine. Is there another way?
  4. I’ve also stumbled upon the torch2trt project, but I’m not certain whether it’s a better option.
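For context on questions 1 and 3, this is the kind of TorchScript round trip I mean (toy module and shapes are hypothetical; the open question is whether mask2former survives this step):

```python
import io
import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        # control flow survives scripting (unlike tracing)
        if x.sum() > 0:
            return self.fc(x)
        return self.fc(-x)

model = Toy().eval()
scripted = torch.jit.script(model)

# serialize/deserialize without needing the Python class on the loading side
buf = io.BytesIO()
torch.jit.save(scripted, buf)
buf.seek(0)
reloaded = torch.jit.load(buf)

x = torch.randn(1, 4)
assert torch.equal(scripted(x), reloaded(x))
```

If this succeeded on the real model, I assume `torch_tensorrt.compile(scripted, ir="torchscript", inputs=[...])` would in principle be the next step toward an engine, but that is exactly what I’m unsure about.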

Sorry for the long post and appreciate any help!

Environment

TensorRT Version: 8.4.1.6
GPU Type: RTX3090
Nvidia Driver Version : 550.54.15
CUDA Version: 12.1
CUDNN Version: 8.9.2
Operating System + Version: Ubuntu 22.04
Python Version (if applicable): 3.8
TensorFlow Version (if applicable): -
PyTorch Version (if applicable): 2.2
Baremetal or Container (if container which image + tag): -


Hi @pompos2 ,
Request you to raise the concern on Issues · pytorch/pytorch · GitHub.
Thanks

Since this is not a bug, we have raised the question here but received no reply.

This is both an NVIDIA issue and a PyTorch issue, and in my opinion it’s more related to NVIDIA.
What is the recommended way, according to NVIDIA, to obtain a TensorRT engine from a PyTorch model? There are multiple tools, all implemented by NVIDIA (including the deprecated backends for torch.compile…

Hi @pompos2 ,
One of the recommended and commonly used ways is to convert the PyTorch model to ONNX and then build the TRT engine from the ONNX file.
Could you give this a try and let us know if the issue persists?
Thanks

Converting to ONNX using torch.onnx.export (i.e. with TorchScript as the backend) is indeed what most NVIDIA tutorials suggest. After PyTorch 2.0, TorchScript seems to be an abandoned project and everything is moving towards Dynamo. In practical terms, converting any model with some level of complexity (like a Swin Transformer) to a TensorRT engine is an impossible feat.

Taking mask2former as an example, which uses Swin as a backbone, one would have to either go all in on converting it to TorchScript and hope it converts, or rely on conversions done by others (in this case from OpenMMLab), which gatekeeps the PyTorch → TensorRT conversion as an arcane spell.

As an example,

" torch.onnx.export is in maintenance mode and we don’t plan to add new operators/features or fix complex issues."

source
