Running a pytorch network converted to ONNX with TensorRT on the TX2

I am trying to run a PyTorch neural network on the TX2 using TensorRT, and I am having problems at the stage of creating a TensorRT engine from the .onnx file.

For instance if I take the vgg16 network found here: https://github.com/pytorch/vision/blob/master/torchvision/models/vgg.py, and export it to the ONNX format like this:

import torch
import torchvision
mynetwork = torchvision.models.vgg.vgg16(pretrained=True)
input = torch.zeros((1, 3, 224, 224))
torch.onnx.export(mynetwork, input, "vgg16.onnx", verbose=True, input_names=["input"], output_names=["output"])

I get a file vgg16.onnx without any apparent problem. However when I feed it to TensorRT’s ONNX parser as is done in the sample, the program exits with the following message:

IndexError: Attribute not found: shape

It seems that this occurs during the call to

parser->convertToTRTNetwork()

Meanwhile, the sampleOnnxMNIST sample that came with tensorRT works just fine. Since the Readme for that sample reads as follows:

This sample demonstrates conversion of an MNIST network in ONNX format to
a TensorRT network. The network used in this sample can be found at https://github.com/onnx/models/tree/master/mnist
(model.onnx)

I went ahead and tried the model.onnx files supplied there (there are three, apparently for ONNX opsets 1, 7 and 8). None of them worked when used to create a TensorRT engine as in the sample. For instance, with the opset 7 model.onnx, I get the following:

[2019-02-21 02:03:24   ERROR] Parameter check failed at: ../builder/Network.cpp::addScale::112, condition: shift.count == 0 || shift.count == weightCount
python: onnx/converterToTRT.h:156: nvonnxparser::TRT_LayerOrWeights nvonnxparser::Converter::convert_node(const onnx::NodeProto*): Assertion `layer' failed.
Aborted (core dumped)

when calling parser->convertToTRTNetwork(). My guess is that the .onnx model supplied with the TensorRT samples comes from an earlier version of these ONNX models.

Which ONNX formats / opsets are supported by TensorRT 4.0.2 on the TX2, and how can we obtain compatible ONNX models from PyTorch models?

Some more information about my setup:
I installed tensorRT from the debian package tensorrt_4.0.2.0-1+cuda9.0_arm64.deb which I got from JetPack and I built and installed torch 1.1.0 from source on the TX2.

Hi,

Do you use this Github for the converter?
https://github.com/onnx/onnx-tensorrt

Thanks.

I am using torch.onnx.export(), which comes with the torch installation. I will try with that repo and post the result.

With the MNIST model (all 3 versions with the different opsets give the same result), I get the following with the onnx/onnx-tensorrt github:

----------------------------------------------------------------
Input filename:   model.onnx
ONNX IR version:  0.0.3
Opset version:    0
Producer name:    CNTK
Producer version: 2.4
Domain:           
Model version:    1
Doc string:       
----------------------------------------------------------------
Parsing model
Building TensorRT engine, FP16 available:1
    Max batch size:     32
    Max workspace size: 1024 MiB
onnx2trt: customWinogradConvActLayer.cpp:48: nvinfer1::cudnn::WinogradConvActLayer::WinogradConvActLayer(const string&, const EngineTensors&, const EngineTensors&, const nvinfer1::ConvolutionParameters&, bool, const std::vector<float>&): Assertion `matchNbDims(inputs[0], outputs[0]) && (inputs.size() == 1 || inputs[1].extent == outputs[0].extent)' failed.
Aborted (core dumped)

With the vgg16.onnx file that I obtained as in my original post, I get the following:

----------------------------------------------------------------
Input filename:   vgg16.onnx
ONNX IR version:  0.0.4
Opset version:    9
Producer name:    pytorch
Producer version: 0.4
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
WARNING: ONNX model has a newer ir_version (0.0.4) than this parser was built against (0.0.3).
Parsing model
Unsupported ONNX data type: INT64 (7)
[2019-02-21 20:20:54   ERROR] Parameter check failed at: ../builder/Layers.cpp::ConstantLayer::1108, condition: weights.type == DataType::kFLOAT || weights.type == DataType::kHALF || weights.type == DataType::kINT32
While parsing node number 33 [Gather -> "66"]:
ERROR: /home/brain/onnx-tensorrt/onnx2trt_utils.hpp:284 In function convert_axis:
[8] Assertion failed: axis >= 0 && axis < nbDims

It looks like the torch-supplied model uses some INT64 values, which I guess is a separate problem.

Hi,

Thanks for your feedback.

We want to check this issue internally.
Could you share the onnx model you converted with us?

Thanks.

The model.onnx can be found here: https://onnxzoo.blob.core.windows.net/models/opset_8/mnist/mnist.tar.gz

For the vgg16.onnx model, I was able to modify the torchvision supplied model to fix this int64 error that I showed in my previous post, and the conversion to a .trt file is successful.
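
For reference, a workaround that is commonly reported for this kind of INT64 / Gather failure is to stop tracing x.size(0) inside forward() and reshape with a constant size instead. The post above does not say which change was actually made, so the following is only a guess at that kind of fix. The names flattened_feature_count, make_static_flatten_vgg16 and StaticFlattenVGG16 are hypothetical, the code assumes 224x224 inputs (so the last feature map is 512x7x7), and it has not been checked against TensorRT 4.

```python
def flattened_feature_count(channels=512, height=7, width=7):
    """Values per image after VGG16's last pooling stage (224x224 input assumed)."""
    return channels * height * width


def make_static_flatten_vgg16(pretrained=False):
    """Build a VGG16 whose forward() reshapes with a constant size.

    torch and torchvision are imported lazily so that the helper above
    can be used without them installed.
    """
    import torch.nn as nn
    import torchvision

    class StaticFlattenVGG16(nn.Module):  # hypothetical wrapper name
        def __init__(self):
            super().__init__()
            # `pretrained=` matches the torchvision API used in the first post.
            base = torchvision.models.vgg16(pretrained=pretrained)
            self.features = base.features
            self.classifier = base.classifier
            self.flat = flattened_feature_count()

        def forward(self, x):
            x = self.features(x)
            # Constant target shape: x.size(0) is never traced, so the
            # exporter has no reason to emit Shape/Gather nodes here.
            x = x.view(-1, self.flat)
            return self.classifier(x)

    return StaticFlattenVGG16()
```

The returned module can be passed to torch.onnx.export() in the same way as mynetwork in the first post.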

Any update on this issue? Were you able to reproduce it? I am seeing the same error that I was getting for the MNIST network above,

onnx2trt: customWinogradConvActLayer.cpp:48: nvinfer1::cudnn::WinogradConvActLayer::WinogradConvActLayer(const string&, const EngineTensors&, const EngineTensors&, const nvinfer1::ConvolutionParameters&, bool, const std::vector<float>&): Assertion `matchNbDims(inputs[0], outputs[0]) && (inputs.size() == 1 || inputs[1].extent == outputs[0].extent)' failed.
Aborted (core dumped)

for other networks as well.

Hello, I encountered the same problem (AlexNet model).
I use torch.onnx.export() to get an ONNX model, then import the model with TensorRT (Python API), and I get the same error.
Have you tried the onnx-tensorrt tool (https://github.com/onnx/onnx-tensorrt)?

Hi, simon472

Sorry for the late reply.

It looks like the model in #6 was generated with opset 8.
Could you regenerate it with opset 7?

According to our documentation, the latest TensorRT supports opset 7. The supported opset may be even lower for TensorRT 4.0.
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#api

We also tested a ResNet-18 model shared on this page:
https://github.com/onnx/models/tree/master/models/image_classification/resnet

It converts to TensorRT successfully and runs without any issue.
Thanks.

Thank you for the follow-up. A version of that MNIST model with opset 7 can be found here:
https://onnxzoo.blob.core.windows.net/models/opset_7/mnist/mnist.tar.gz

Using this I get the same error as in post #4 above with onnx2trt.

Hi,

We also hit an error when converting this ONNX model to TensorRT.
The issue has been passed to our internal team, and we will share more information with you later.

Thanks.

Hi,

I ran into the same problem when converting a resnet50 ONNX model to TensorRT using https://github.com/onnx/onnx-tensorrt. The reason seems to be that a Gather operation is applied on axis 0. I got the error “Assertion failed: axis >= 0 && axis < nbDims”. Then I found this in the onnx-tensorrt source code, onnx2trt_utils.hpp line 272:

272 // Convert an ONNX axis into a TRT axis
273 inline Status convert_axis(int& axis, int nbDims)

but I cannot figure out how an axis in ONNX differs from an axis in TensorRT. Any information is appreciated.
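
A likely explanation, offered as an assumption rather than a confirmed answer: TensorRT releases of this era treat the batch dimension as implicit, so a TensorRT tensor has one dimension fewer than the ONNX tensor it came from, and ONNX axis 0 (the batch axis) has no TensorRT counterpart. The Python function below is a rough model of that mapping, reconstructed from a reading of the onnx-tensorrt source and not copied from it. It is an illustration only.

```python
def convert_axis(axis, nb_dims):
    """Map an ONNX axis to a TensorRT axis under an implicit batch dimension.

    `nb_dims` is the rank of the TensorRT tensor, which excludes batch.
    """
    if axis < 0:
        # Negative axes count from the end, which is the same in both layouts.
        axis += nb_dims
    else:
        # Non-negative axes shift down by one because batch is dropped.
        axis -= 1
    if not 0 <= axis < nb_dims:
        raise ValueError("Assertion failed: axis >= 0 && axis < nbDims")
    return axis
```

For an NCHW input, TensorRT sees CHW, so nb_dims is 3. convert_axis(1, 3) returns 0 (the channel axis), while convert_axis(0, 3) raises, which matches the assertion reported for a Gather on axis 0.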

Hi, 625240378

Could you first check which opset version you are using?
Currently, TensorRT 5.0 supports opset 7 only.

Thanks
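
One way to check the opset of an existing .onnx file is to read its opset table with the onnx Python package. This is a minimal sketch: summarize_opsets, exceeds_supported_opset and load_opsets are hypothetical helper names, and max_supported is a value you supply from the documentation of your TensorRT release.

```python
def summarize_opsets(opset_imports):
    """Map each operator-set domain to its version.

    `opset_imports` is any iterable of objects with `.domain` and
    `.version` attributes, such as `model.opset_import` from the onnx
    package. An empty domain string denotes the default "ai.onnx" set.
    """
    return {(entry.domain or "ai.onnx"): entry.version for entry in opset_imports}


def exceeds_supported_opset(opsets, max_supported):
    """True if the default-domain opset is newer than `max_supported`."""
    return opsets.get("ai.onnx", 0) > max_supported


def load_opsets(path):
    """Read the opset table from an .onnx file (requires the onnx package)."""
    import onnx  # imported lazily so the helpers above work without it

    model = onnx.load(path)
    return summarize_opsets(model.opset_import)
```

For example, exceeds_supported_opset(load_opsets("vgg16.onnx"), 7) tells you whether the exported file is newer than opset 7.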

Hi,
I exported an ONNX model with torch.onnx.export() and used parser->convertToTRTNetwork() to create the engine, but I have the same problem as you. The version I’m using is TensorRT 4.0. How did you solve this problem, please?

Hi,

Would you mind trying TensorRT v5.0 first?
Opset-7 support is not available before v5.0.

Thanks.

Hi,
I am experiencing the same issue as @625240378 when converting VGG16 with opset 7:

2019-04-24 11:45:28,395 - INFO - tf2onnx.tfonnx: Using tensorflow=1.13.1, onnx=1.4.1, tf2onnx=1.5.0/1f8589
2019-04-24 11:45:28,395 - INFO - tf2onnx.tfonnx: Using opset <onnx, 7>
2019-04-24 11:45:31,010 - VERBOSE - tf2onnx.tfonnx: Summay Stats:
	tensorflow ops: Counter({'Const': 37, 'BiasAdd': 16, 'Relu': 15, 'Conv2D': 13, 'MaxPool': 5, 'MatMul': 3, 'Pack': 1, 'Prod': 1, 'Softmax': 1, 'Placeholder': 1, 'Shape': 1, 'Reshape': 1, 'StridedSlice': 1})
	tensorflow attr: Counter({'T': 58, 'dtype': 38, 'value': 37, 'data_format': 34, 'padding': 18, 'strides': 18, 'use_cudnn_on_gpu': 13, 'dilations': 13, 'ksize': 5, 'transpose_a': 3, 'transpose_b': 3, 'begin_mask': 1, 'N': 1, 'Index': 1, 'keep_dims': 1, 'end_mask': 1, 'shape': 1, 'new_axis_mask': 1, 'axis': 1, 'ellipsis_mask': 1, 'Tshape': 1, 'Tidx': 1, 'shrink_axis_mask': 1, 'out_type': 1})
	onnx mapped: Counter({'Const': 37, 'BiasAdd': 16, 'Relu': 15, 'Conv2D': 13, 'MaxPool': 5, 'MatMul': 3, 'Pack': 1, 'Prod': 1, 'Softmax': 1, 'Placeholder': 1, 'Shape': 1, 'Reshape': 1, 'StridedSlice': 1})
	onnx unmapped: Counter()

As you can see, all the layers are mapped correctly to the ONNX model. But whenever I parse the ONNX model with TensorRT (I am using the latest one: 5.1.2.2), I get the same error:

Description of the error: Assertion failed: axis >= 0 && axis < nbDims
Node where the error occurred: 34
Error code: UNSUPPORTED_NODE
Model was not parsed successfully
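
To see which operator the reported node number refers to, the graph can be inspected with the onnx Python package. This is a minimal sketch that assumes the parser's node number is the index into model.graph.node, which has not been verified here. describe_node and describe_node_in_file are hypothetical helper names.

```python
def describe_node(nodes, index):
    """Return a one-line summary of the graph node at `index`.

    `nodes` is any sequence of objects with `op_type`, `input` and
    `output` attributes, such as `model.graph.node` from the onnx package.
    """
    node = nodes[index]
    inputs = ", ".join(node.input)
    outputs = ", ".join(node.output)
    return "#{} {}({}) -> {}".format(index, node.op_type, inputs, outputs)


def describe_node_in_file(path, index):
    """Load an .onnx file and summarize one node (requires the onnx package)."""
    import onnx  # imported lazily so describe_node works without it

    model = onnx.load(path)
    return describe_node(model.graph.node, index)
```

Calling describe_node_in_file on the exported file with index 34 shows the operator type and tensor names of that node, which can then be compared against the TensorRT support matrix.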

I have the same issues with ONNX to TensorRT conversion. How do I install TensorRT v5.0 on Jetson? The latest JetPack only supports TensorRT 4, if I am not mistaken.

JetPack 4.2 includes TensorRT 5.0.6.3. You can install it on the TX2.

Hi,

From your log:

Error code: UNSUPPORTED_NODE

Some layers inside your model are not supported by TensorRT.
You can find our support matrix here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-support-matrix/index.html

For a non-supported layer, you can add your own implementation as a plugin layer into TensorRT.
You can check our tutorial here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-sample-support-guide/index.html#uffssd_sample

Thanks.

If I only have an ONNX model, how can I add an unsupported layer? Does that mean I cannot import the ONNX file and need to add all my layers in C++?