Cannot export the Detectron2 model to ONNX for use with TensorRT on TX2

Dear all,

Not long ago, I posted a topic about Detectron2 on TensorRT, although that working environment was a Windows system.

I have now reproduced the issue on my Jetson TX2 device. Detectron2 is quite popular nowadays and represents one of the SOTA techniques. I hope this issue gets some attention, because I believe many people want to use Detectron2 with TensorRT on Jetson devices as well.


Reproduce my issue:

Here I show what I have tried. The part that succeeds includes only the Backbone+FPN.

  1. Clone Detectron2 with git and install it via setup.py (I installed this version. Here)
  2. Go to the detectron2/tools folder
  3. Download test_detect.py (the file is here) and put it in the detectron2/tools folder
  4. Open a command line/terminal in the detectron2/tools folder and run python3 test_detect.py

NOTE:
Check this: https://gist.github.com/chiehpower/7d2a598c9c2b6bef96a525c2f93ae927#file-test_detect-py-L172-L173

Please check lines 172 and 173, shown below:

 dummy_convert(cfg, only_backbone = True) # only backbone + FPN
 dummy_convert(cfg, only_backbone = False) # all

If only_backbone = True, the conversion succeeds, but it covers only the backbone + FPN.
However, if only_backbone = False, the whole model is included and the conversion fails.


Error message:

Here is my entire output:

Traceback (most recent call last):
  File "test_detect.py", line 173, in <module>
    dummy_convert(cfg, only_backbone = False) # all
  File "test_detect.py", line 152, in dummy_convert
    export_params=True
  File "/usr/local/lib/python3.6/dist-packages/torch/onnx/__init__.py", line 143, in export
    strip_doc_string, dynamic_axes, keep_initializers_as_inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/onnx/utils.py", line 66, in export
    dynamic_axes=dynamic_axes, keep_initializers_as_inputs=keep_initializers_as_inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/onnx/utils.py", line 382, in _export
    fixed_batch_size=fixed_batch_size)
  File "/usr/local/lib/python3.6/dist-packages/torch/onnx/utils.py", line 249, in _model_to_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
  File "/usr/local/lib/python3.6/dist-packages/torch/onnx/utils.py", line 206, in _trace_and_get_graph_from_model
    trace, torch_out, inputs_states = torch.jit.get_trace_graph(model, args, _force_outplace=True, _return_inputs_states=True)
  File "/usr/local/lib/python3.6/dist-packages/torch/jit/__init__.py", line 275, in get_trace_graph
    return LegacyTracedModule(f, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/jit/__init__.py", line 355, in forward
    out_vars, _ = _flatten(out)
RuntimeError: Only tuples, lists and Variables supported as JIT inputs/outputs. Dictionaries and strings are also accepted but their usage is not recommended. But got unsupported type Instances

We know that to use the model on TensorRT, we have to export it to ONNX and then convert the ONNX model to a TensorRT engine. However, many components in Detectron2 are written as Python classes, so we cannot export the model to ONNX because of this Python class issue.
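
To illustrate why the backbone-only export passes while the full model fails, here is a simplified sketch of the kind of output type check described in the error message. It is not PyTorch's actual code: `FakeTensor`, `FakeInstances`, and `first_unsupported_type` are hypothetical names made up for this illustration, and it assumes the backbone + FPN returns a dict of feature maps while the full model returns a list of dicts holding an Instances object.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor so the sketch runs without PyTorch."""


class FakeInstances:
    """Hypothetical stand-in for a custom container such as Detectron2's Instances."""


def first_unsupported_type(obj):
    """Return the name of the first type a tracer-like check would reject, or None."""
    # Tensors (and, per the error message, strings) are accepted as leaves.
    if isinstance(obj, (FakeTensor, str)):
        return None
    # Tuples, lists, and dictionaries are accepted as containers.
    if isinstance(obj, (tuple, list)):
        children = obj
    elif isinstance(obj, dict):
        children = obj.values()
    else:
        # Anything else, e.g. an arbitrary Python class, is rejected.
        return type(obj).__name__
    for child in children:
        bad = first_unsupported_type(child)
        if bad is not None:
            return bad
    return None


# Assumed shape of the backbone + FPN output: a dict of feature maps.
backbone_out = {"p2": FakeTensor(), "p3": FakeTensor()}
# Assumed shape of the full model output: a list of dicts with a custom object.
full_model_out = [{"instances": FakeInstances()}]

print(first_unsupported_type(backbone_out))    # prints None
print(first_unsupported_type(full_model_out))  # prints FakeInstances
```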

Thank you.


Environment information

  • PyTorch : 1.3.0
  • Detectron2 : 0.1.1
  • CUDA : 10.0
  • Python : 3.6.9
  • JetPack version : 4.3
  • cuDNN version : 7.6.3
  • cmake version : 3.10.2
  • onnxruntime-gpu-tensorrt : 1.0.0

Hi,

Do you have any dependencies on JetPack 4.3?
If not, we recommend upgrading your device to JetPack 4.4 first.

Thanks.

Hi @AastaLLL,

Thanks for your advice.

The point is that I have worked on JetPack v4.3 for a long time, so the environment is set up very completely. Many libraries and tools on the Jetson device had to be built from source, and I am afraid that other libraries will stop working if I upgrade.

In addition, I don’t think this issue can be resolved by upgrading the JetPack version. Is there any critical reason why we should upgrade to JetPack v4.4 rather than keep working on v4.3?

Thank you so much.

Hi,

Thanks for your reply. We are checking this issue now.

In general, you can get more support with the latest TensorRT 7.1.
But we can stay on JetPack 4.3 to narrow down the issue first.

Thanks.

Hi AastaLLL,

Thanks for the information. Actually, I think you can still try this on JetPack 4.4, because the issue occurs before the TensorRT model is generated.

Thank you.

Hi,

We can reproduce this in our environment.
Based on the error log, PyTorch complains about the data type of outs.

RuntimeError: Only tuples, lists and Variables supported as JIT inputs/outputs. Dictionaries and strings are also accepted but their usage is not recommended. But got unsupported type Instances

Here is the detailed information about outs, which contains an Instances object but is expected to be a tuple, list, or Variable.

[[{'instances': Instances(num_instances=0, image_height=800, image_width=800, fields=[pred_boxes: Boxes(tensor([], device='cuda:0', size=(tensor(0), tensor(4)),
       grad_fn=<ViewBackward>)), scores: tensor([], device='cuda:0', grad_fn=<IndexBackward>), pred_classes: tensor([], device='cuda:0', dtype=torch.int64), pred_masks: tensor([], device='cuda:0', size=(tensor(0), tensor(800), tensor(800)),
       dtype=torch.uint8)])}]]
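
One possible way around this specific type error, sketched under assumptions, is to flatten the output into a plain tuple before it reaches the tracer, for example inside a thin wrapper module around the model. The sketch below shows only the flattening step and runs without PyTorch or Detectron2: `FakeBoxes`, `FakeInstances`, and `flatten_outputs` are hypothetical names, the stand-ins mimic the `get_fields()` method of Instances and the `.tensor` attribute of Boxes, and plain lists stand in for tensors. It addresses only the output type; operators that ONNX cannot express may still fail afterwards.

```python
class FakeBoxes:
    """Hypothetical stand-in for Detectron2's Boxes, which keeps its data in `.tensor`."""

    def __init__(self, tensor):
        self.tensor = tensor


class FakeInstances:
    """Hypothetical stand-in for Detectron2's Instances container."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def get_fields(self):
        return self._fields


def flatten_outputs(outputs,
                    field_order=("pred_boxes", "scores", "pred_classes", "pred_masks")):
    """Turn a list of {'instances': Instances} dicts into a flat tuple of raw values."""
    flat = []
    for per_image in outputs:
        fields = per_image["instances"].get_fields()
        for name in field_order:
            value = fields[name]
            # Unwrap Boxes-like objects; leave plain tensors untouched.
            flat.append(getattr(value, "tensor", value))
    return tuple(flat)


# Plain lists stand in for tensors in this demonstration.
outs = [{"instances": FakeInstances(
    pred_boxes=FakeBoxes([[0, 0, 1, 1]]),
    scores=[0.9],
    pred_classes=[3],
    pred_masks=[[1]],
)}]
flat = flatten_outputs(outs)
print(len(flat))  # prints 4
```

In a real export, a function like this would be called at the end of a wrapper module's forward method so that the traced output is a tuple of tensors in a fixed order.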

This issue looks like a user-space problem.
Have you run this script in another environment before?

Thanks.

Hi @AastaLLL

Thanks for checking.

We know this reason, and we also tried the script on Windows, where we got the same problem.
That is the core point I mentioned at the beginning: the Python class issue. Detectron2 uses a lot of Python classes.

In your professional opinion, do you think there is any chance this issue can be solved? I mean using Detectron2 with TensorRT, which includes fixing the Python class issue and exporting Detectron2 to an ONNX model.

Thank you so much.

Chieh

Hi,

Have you successfully executed this script in another environment before, such as a different platform or older software?

The error indicates an input type mismatch in the PyTorch backend.
In general, it’s caused by the user-space app.

I checked the sample on their official website but could only find a single-input example.
Could you first help confirm that the usage in test_detect.py is correct?

This will help us figure out the cause of the error.
Thanks.

Dear @AastaLLL,
Nice to meet you, I am @Chieh's colleague. Currently, we are trying to fix the issue with the ONNX conversion.

Regarding your statement, I am sorry, but AFAIK I cannot find such an example anywhere in the detectron2 GitHub repository.

FYI, detectron2 runs the inference/demo like this:

python demo.py --config-file ....

If you check demo.py, they use this as the predictor:

And if you look at line 217 of DefaultPredictor, they use a dictionary of 3 inputs instead of the single input you described.

a. Could you help me pinpoint the single-input example?

b. Also, the crux of the issue is that ONNX cannot convert detectron2’s model directly without adding 3rd party modules (caffe).
Hence, as @Chieh mentioned, if there is a way to bypass the conversion, or a way to convert the detectron2 model to TensorRT, it would be much appreciated.

Thank you~

Hi,

Sorry, our previous comment may have led to some misunderstanding.
The single-input example is for torch.onnx.export rather than detectron2.
https://pytorch.org/docs/stable/onnx.html

We suspected this because the error indicates an unsupported dict type.

We understand that this issue is about how to convert detectron2 to ONNX, or even TensorRT,
and that no example is currently available (please correct us if we are missing anything).

We are going to check this in detail.
But since we are not familiar with detectron2, it may take some time to make progress.

Thanks.

Hi @AastaLLL,

Actually, our final goal in this topic is to convert detectron2 to ONNX.

To be honest, after delving into the problem, we also realized that this is a little complicated to achieve.

Thanks for your attention to this issue.
If you learn anything new about this topic, I would be grateful if you could share it with us.

Thanks.

Best regards,
Chieh

Thanks.
We will keep you updated on this issue.

Hello, @Chieh, @AastaLLL

I am also going to use Detectron2.
Is there any further progress on this issue?

Thank you.

We are still stuck on the operation issue.

If you have any news or progress, you are welcome to share an update!

Hi,

Sorry, we need more time for this due to limited resources.

Thanks.