tensorrt's onnx parser can't parse the output layer correctly

Edwardf0t1 · March 7, 2019, 1:02am

Hello,

I converted a standard ResNet-18 PyTorch model to an ONNX model using:

import torch
from torch.autograd import Variable

# ResNet18 and args are defined elsewhere in my script
model = ResNet18(args).cuda()

print("pytorch to onnx")
# Translate the PyTorch model into an ONNX model
dummy_input = Variable(torch.randn(args.batch_size, args.input_channel,
                                   args.input_size, args.input_size, device='cuda'))
print("dummy input generated")
output_names = ["output"]

torch.onnx.export(model, dummy_input, args.onnx_model_name, verbose=False,
                  output_names=output_names)

Next, I checked the structure of the generated resnet18.onnx using:

import onnx

# Load the ONNX model
model = onnx.load("resnet18.onnx")

# Check that the IR is well formed
onnx.checker.check_model(model)

# Print a human readable representation of the graph
print(onnx.helper.printable_graph(model.graph))

I noticed the following information about the last few layers:

%115[FLOAT, 512x512x3x3]
%116[FLOAT, 512]
%117[FLOAT, 512]
%118[FLOAT, 512]
%119[FLOAT, 512]
%120[INT64, scalar]
%121[FLOAT, 10x512]
%122[FLOAT, 10]

%188 = Pad[mode = 'constant', pads = [0, 0, 0, 0, 0, 0, 0, 0], value = 0]
%189 = AveragePool[kernel_shape = [4, 4], pads = [0, 0, 0, 0], strides = [4, 4]]
%190 = Constant[value = <Scalar Tensor []>]
%191 = Shape(%189)
%192 = Gather[axis = 0](%191, %190)
%193 = Constant[value = <Scalar Tensor []>]
%194 = Unsqueeze[axes = [0]]
%195 = Unsqueeze[axes = [0]]
%196 = Concat[axis = 0](%194, %195)
%197 = Reshape(%189, %196)
%output = Gemm[alpha = 1, beta = 1, transB = 1](%197, %121, %122)
return %output

However, when I used TensorRT’s ONNX parser to parse resnet18.onnx:

# The Onnx path is used for Onnx models.
def build_engine_onnx(model_file):
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = common.GiB(1)
        # Load the Onnx model and parse it in order to populate the TensorRT network.
        with open(model_file, 'rb') as model:
            parser.parse(model.read())
        print(network.get_layer(network.num_layers - 1).get_output(0).shape)
        network.mark_output(network.get_layer(network.num_layers - 1).get_output(0))  
        return builder.build_cuda_engine(network)

I got a shape of (512, 1, 1) rather than (10, 1, 1) for the output layer. I also checked the shape of the second-to-last layer, which seems correct: (512, 4, 4). So it looks like the ONNX parser cannot parse the last layer correctly.

I found that a similar issue has been reported, but in my case this is a standard ResNet-18 model with input shape (3, 32, 32). Can anyone help?

Thanks.

NVES · March 7, 2019, 4:12pm

Hello,

I’m using ResNet-18 from the ONNX model zoo: https://s3.amazonaws.com/onnx-model-zoo/resnet/resnet18v2/resnet18v2.onnx, and TRT seems to generate the correct output layer, in this case 1x1000.

Can you share your ONNX model?

Edwardf0t1 · March 7, 2019, 8:11pm

Hi,

Thanks for the info. I just wonder how this resnet18v2.onnx was generated. Could you also share the script that generates this model?

My resnet18.onnx was converted from the PyTorch model using the code shown above. I have sent you the model via private message.

Thanks for the help.

NVES · March 8, 2019, 4:30pm

Hello, resnet18v2.onnx is from the ONNX model zoo:
https://github.com/onnx/models/tree/master/models/image_classification/resnet

Reviewing your model now.

NVES · March 8, 2019, 6:01pm

OK, with your model I’m seeing the following, using TRT 5.0.2:

root@24a6bbbcddf1:/home/scratch.zhenyih_sw/tensorrt/tensorrt.nves.git/tensorrt/model.conversion/onnx/src# python generate.py
DataType.FLOAT
Loading and serializing ONNX model...
Build engine...
(512, 1, 1)
[TensorRT] ERROR: Network must have at least one output
Traceback (most recent call last):
  File "generate.py", line 108, in <module>
    fn_engine = serialize(fn_onnx, batch_size, datatype, model_dir)
  File "generate.py", line 84, in serialize
    with build_engine_from_onnx(trt_logger, max_batch_size, max_workspace_size, datatype, fn_onnx, calibrator) as engine:
AttributeError: __exit__

Edwardf0t1 · March 8, 2019, 8:07pm

Hi,

Thanks for the info. But this model is pre-generated, i.e. there’s no script that generates the ONNX model directly. Instead, the model is defined in Gluon and then converted to ONNX, and the conversion script has not been released:

“The conversion of the model to ONNX format is done using an internal converter which will be released soon”

So not very helpful to me…

Edwardf0t1 · March 8, 2019, 8:10pm

NVES:

[TensorRT] ERROR: Network must have at least one output
Yes, I have to manually mark the output layer using:

network.mark_output(network.get_layer(network.num_layers - 1).get_output(0))

But after doing this, the dimension of the output layer is not correct. It should be (10,1,1),
but I saw (512,1,1).

Edwardf0t1 · March 13, 2019, 9:36pm

Hello,

Just wondering if there’s any update?

Thank you.

NVES · April 1, 2019, 6:11pm

hello,

Per engineering, looks like the real error is hidden by the fact that the parser did not fully parse the network. We need to add a check around parser.parse to catch any actual errors, please see the updated code below:

# The Onnx path is used for Onnx models.
def build_engine_onnx(model_file):
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = common.GiB(1)
        # Load the Onnx model and parse it in order to populate the TensorRT network.
        with open(model_file, 'rb') as model:
            #parser.parse returns a bool, and we were not checking it originally.
            if not parser.parse(model.read()):
                print(parser.get_error(0))
        print(network.get_layer(network.num_layers - 1).get_output(0).shape)
        network.mark_output(network.get_layer(network.num_layers - 1).get_output(0))  
        return builder.build_cuda_engine(network)

This will report that the network is trying to do a Gather on axis 0, which TRT does not support. We have an issue open upstream with PyTorch here:

ONNX graph exporter issues from TensorRT 5 perspective · Issue #16908 · pytorch/pytorch · GitHub.

Can you update your parsing code to see if you get the same result that our engineers got?
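
The shapes reported earlier in the thread are consistent with the parser stopping right after the AveragePool node (%189), before the Shape/Gather chain. The arithmetic below is only a sketch: `pool_out` is a helper name made up for this illustration, and it assumes the Reshape (%197) flattens each sample to a 512-element vector, which is what the 10x512 Gemm weight (%121) requires.

```python
def pool_out(size, kernel, stride, pad=0):
    """Output length of a pooling window along one spatial axis (floor mode)."""
    return (size + 2 * pad - kernel) // stride + 1


# Shape reported for the second-to-last parsed layer, as (C, H, W).
channels, height, width = (512, 4, 4)

# %189 AveragePool: kernel_shape = [4, 4], strides = [4, 4], pads all zero.
pooled = (channels, pool_out(height, 4, 4), pool_out(width, 4, 4))
print(pooled)  # (512, 1, 1)

# %190-%197: assumed to flatten each sample to C * H * W values.
flat = pooled[0] * pooled[1] * pooled[2]

# %output Gemm with transB = 1 and weight %121 of shape 10x512:
# a 512-element vector times the transposed weight gives 10 values.
weight_shape = (10, 512)
assert flat == weight_shape[1]
logits = weight_shape[0]
print(logits)  # 10
```

If parsing had completed, the last layer would carry 10 values per sample. Seeing (512, 1, 1) instead suggests the network was cut off at the pooling layer, so marking the last parsed layer as the output marks the wrong tensor.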

Edwardf0t1 · April 1, 2019, 8:44pm

Hi, I just added the checker and got the following (print(parser.get_error(0))):

<tensorrt.tensorrt.ParserError object at 0x7f29b974fa40>

This is not very informative to me, and I did not see the same error message shown in the link you provided. What is the error message on your side?

NVES · April 1, 2019, 10:42pm

hello,

You can print what fields of the error you want to see by accessing the ParserError members:

https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/parsers/Onnx/pyOnnx.html#tensorrt.ParserError
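
As a rough sketch of what accessing those members could look like: the helper below walks every recorded error instead of only error 0. The function name `describe_parser_errors` is made up for this example; `num_errors`, `get_error`, and the `ParserError` accessors `node()`, `desc()`, `code()`, `file()`, `line()` and `func()` are taken from the linked Python API page and may differ between TensorRT versions.

```python
def describe_parser_errors(parser):
    """Collect one readable line per error recorded by the ONNX parser.

    `parser` is assumed to be a trt.OnnxParser whose parse() returned False.
    """
    lines = []
    for i in range(parser.num_errors):
        err = parser.get_error(i)
        lines.append(
            "error {}: node {} - {} (code {}, {}:{} in {})".format(
                i, err.node(), err.desc(), err.code(),
                err.file(), err.line(), err.func()))
    return lines
```

Inside build_engine_onnx, this could replace the bare print: when parser.parse(...) returns False, print "\n".join(describe_parser_errors(parser)) and return early rather than building an engine from a partially parsed network.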

haiderasad · November 24, 2021, 9:59am

Hi, I have the same problem: my ONNX model has 34 layers, but the TensorRT parser gives only the first 24. Can you please tell me what the issue could be? I added your parser.get_error line, but no error is shown and the engine is built successfully; the output, however, is that of the 24th layer.

NVES · November 24, 2021, 12:09pm

Hi,
Please share the ONNX model and the script, if not shared already, so that we can assist you better.
In the meantime, you can try a few things:

1) Validate your model with the snippet below.

check_model.py

import sys
import onnx
filename = yourONNXmodel  # placeholder: replace with the path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)

2) Try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
If you are still facing issues, please share the trtexec "--verbose" log for further debugging.
Thanks!
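
When the parsed network seems to end early, listing what the parser actually created can help locate where it stopped. This is a minimal sketch, not an official diagnostic: `list_layers` is a made-up name, and it relies only on `num_layers`, `get_layer`, `get_output(0).shape` (all used earlier in this thread) plus the layer's `name` attribute. Note that the TensorRT layer count does not have to equal the ONNX node count, so the names are more telling than the totals.

```python
def list_layers(network):
    """Return (index, layer name, first output shape) for each parsed layer.

    `network` is assumed to be the TensorRT network populated by the parser.
    """
    rows = []
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        rows.append((i, layer.name, layer.get_output(0).shape))
    return rows
```

Printing these rows after parser.parse(...) and comparing the last name with the final nodes in the ONNX graph shows which node the parser reached.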
