TensorRT's ONNX parser can't parse the output layer correctly

Hello,

I converted a standard ResNet-18 PyTorch model to an ONNX model using:

import torch
from torch.autograd import Variable

# ResNet18 and args come from the poster's own training script
model = ResNet18(args).cuda()

print("pytorch to onnx")
# Translate the PyTorch model into an ONNX model
dummy_input = Variable(torch.randn(args.batch_size, args.input_channel,
                                   args.input_size, args.input_size, device='cuda'))
print("dummy input generated")
output_names = ["output"]

torch.onnx.export(model, dummy_input, args.onnx_model_name, verbose=False,
                  output_names=output_names)

Next, I checked the structure of the generated resnet18.onnx using:

import onnx

# Load the ONNX model
model = onnx.load("resnet18.onnx")

# Check that the IR is well formed
onnx.checker.check_model(model)

# Print a human readable representation of the graph
# (printable_graph returns a string, so it has to be printed explicitly)
print(onnx.helper.printable_graph(model.graph))

I noticed the following information about the last few layers:

%115[FLOAT, 512x512x3x3]
%116[FLOAT, 512]
%117[FLOAT, 512]
%118[FLOAT, 512]
%119[FLOAT, 512]
%120[INT64, scalar]
%121[FLOAT, 10x512]
%122[FLOAT, 10]

%188 = Pad[mode = 'constant', pads = [0, 0, 0, 0, 0, 0, 0, 0], value = 0]
%189 = AveragePool[kernel_shape = [4, 4], pads = [0, 0, 0, 0], strides = [4, 4]]
%190 = Constant[value = <Scalar Tensor []>]
%191 = Shape(%189)
%192 = Gather[axis = 0](%191, %190)
%193 = Constant[value = <Scalar Tensor []>]
%194 = Unsqueeze[axes = [0]]
%195 = Unsqueeze[axes = [0]]
%196 = Concat[axis = 0](%194, %195)
%197 = Reshape(%189, %196)
%output = Gemm[alpha = 1, beta = 1, transB = 1](%197, %121, %122)
return %output

However, when I parse resnet18.onnx with TensorRT's ONNX parser:

# The Onnx path is used for Onnx models.
def build_engine_onnx(model_file):
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = common.GiB(1)
        # Load the Onnx model and parse it in order to populate the TensorRT network.
        with open(model_file, 'rb') as model:
            parser.parse(model.read())
        print(network.get_layer(network.num_layers - 1).get_output(0).shape)
        network.mark_output(network.get_layer(network.num_layers - 1).get_output(0))  
        return builder.build_cuda_engine(network)

I got a shape of (512, 1, 1) rather than (10, 1, 1) for the output layer. I also checked the shape of the second-to-last layer, which seems correct: (512, 4, 4). So it looks like the ONNX parser cannot parse the last layer correctly.
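
To see where the parsed network actually ends, it can help to list every layer with its output shape instead of only the last one. The sketch below is not from the original post: `describe_layers` is a hypothetical helper, and it assumes only the `num_layers`, `get_layer`, `name`, `type`, `num_outputs` and `get_output` members of the TensorRT Python network and layer objects.

```python
def describe_layers(network):
    """Return one (index, name, type, first-output shape) row per layer.

    `network` is expected to behave like a TensorRT INetworkDefinition;
    this helper name is an illustration, not part of TensorRT.
    """
    rows = []
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        # Some layers may expose no outputs; record None in that case.
        if layer.num_outputs > 0:
            shape = tuple(layer.get_output(0).shape)
        else:
            shape = None
        rows.append((i, layer.name, str(layer.type), shape))
    return rows


def print_layers(network):
    """Print the rows produced by describe_layers, one per line."""
    for index, name, kind, shape in describe_layers(network):
        print("{:4d}  {:<30s} {:<25s} {}".format(index, name, kind, shape))
```

Calling `print_layers(network)` right after `parser.parse(...)` inside `build_engine_onnx` would show whether the Reshape and Gemm layers from the ONNX graph were added at all.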

I found that a similar issue has been reported, but in my case this is a standard ResNet-18 model with input shape (3, 32, 32). Can anyone help?

Thanks.

Hello,

I’m using ResNet-18 from the ONNX model zoo: https://s3.amazonaws.com/onnx-model-zoo/resnet/resnet18v2/resnet18v2.onnx, and TRT seems to generate the correct output layer, in this case 1x1000.

Can you share your ONNX model?

Hi,

Thanks for the info. How was this resnet18v2.onnx generated? Could you also share the script that generates this model?

My resnet18.onnx was converted from the PyTorch model using the code shown above. I have sent you the model via private message.

Thanks for the help.

Hello, resnet18v2.onnx is from the ONNX model zoo:
https://github.com/onnx/models/tree/master/models/image_classification/resnet

Reviewing your model now.

OK, with your model, I’m seeing the following, using TRT 5.0.2:

root@24a6bbbcddf1:/home/scratch.zhenyih_sw/tensorrt/tensorrt.nves.git/tensorrt/model.conversion/onnx/src# python generate.py
DataType.FLOAT
Loading and serializing ONNX model...
Build engine...
(512, 1, 1)
[TensorRT] ERROR: Network must have at least one output
Traceback (most recent call last):
  File "generate.py", line 108, in <module>
    fn_engine = serialize(fn_onnx, batch_size, datatype, model_dir)
  File "generate.py", line 84, in serialize
    with build_engine_from_onnx(trt_logger, max_batch_size, max_workspace_size, datatype, fn_onnx, calibrator) as engine:
AttributeError: __exit__

Hi,

Thanks for the info. But this model is pre-generated, i.e. there is no script that generates the ONNX model directly. Instead, the model is defined in Gluon and then converted to ONNX, and the conversion script has not been released:

“The conversion of the model to ONNX format is done using an internal converter which will be released soon”

So this is not very helpful to me…

Yes, I have to manually mark the output layer using:

network.mark_output(network.get_layer(network.num_layers - 1).get_output(0))

But after doing this, the dimension of the output layer is not correct. It should be (10, 1, 1), but I saw (512, 1, 1).
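
As a sanity check on the expected shapes, the arithmetic for the last few nodes of the printed graph can be sketched as follows. The helper names are hypothetical, and the numbers are taken from the graph dump above (a 512x4x4 input to AveragePool, kernel and stride 4, and a 10x512 Gemm weight with transB = 1). The pooled shape (512, 1, 1) matches what was printed, which would be consistent with the last parsed layer being the AveragePool rather than the Gemm.

```python
def avg_pool_shape(chw, kernel, strides, pads=(0, 0, 0, 0)):
    """Output (C, H, W) of an ONNX-style 2D AveragePool (no ceil_mode).

    pads follows the ONNX order: (h_begin, w_begin, h_end, w_end).
    """
    c, h, w = chw
    out_h = (h + pads[0] + pads[2] - kernel[0]) // strides[0] + 1
    out_w = (w + pads[1] + pads[3] - kernel[1]) // strides[1] + 1
    return (c, out_h, out_w)


def gemm_out_features(in_features, weight_shape):
    """Output feature count of Gemm with transB = 1 (weight is out x in)."""
    out_f, in_f = weight_shape
    if in_f != in_features:
        raise ValueError("Gemm expects {} input features, got {}".format(in_f, in_features))
    return out_f


# Values from the graph dump: %189 AveragePool, %197 Reshape, %output Gemm.
pooled = avg_pool_shape((512, 4, 4), kernel=(4, 4), strides=(4, 4))
flattened = pooled[0] * pooled[1] * pooled[2]      # Reshape to (batch, 512)
classes = gemm_out_features(flattened, (10, 512))  # %121 is 10x512

print("AveragePool output:", pooled)
print("Flattened features:", flattened)
print("Gemm output features:", classes)
```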

Hello,

Is there any update on this?

Thank you.

Hello,

Per engineering, it looks like the real error is hidden because the parser did not fully parse the network. We need to add a check around parser.parse to catch any actual errors. Please see the updated code below:

# The Onnx path is used for Onnx models.
def build_engine_onnx(model_file):
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = common.GiB(1)
        # Load the Onnx model and parse it in order to populate the TensorRT network.
        with open(model_file, 'rb') as model:
            # parser.parse returns a bool, and we were not checking it originally.
            if not parser.parse(model.read()):
                print(parser.get_error(0))
        print(network.get_layer(network.num_layers - 1).get_output(0).shape)
        network.mark_output(network.get_layer(network.num_layers - 1).get_output(0))  
        return builder.build_cuda_engine(network)

This reports that the network is trying to do a Gather on axis 0, which TRT does not support. We have an issue open upstream with PyTorch here:

https://github.com/pytorch/pytorch/issues/16908.

Can you update your parsing code to see if you get the same result that our engineers got?
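
Independently of TensorRT, the exported graph itself can be scanned for the Gather nodes in question. The following is only a sketch: `find_gather_nodes` is a hypothetical helper, and it relies on the `node`, `op_type` and `attribute` fields of the ONNX protobuf graph and on the ONNX default of `axis = 0` for Gather when the attribute is absent.

```python
def find_gather_nodes(graph, axis=0):
    """Return the Gather nodes in an ONNX graph that operate on `axis`.

    `graph` is expected to look like onnx.GraphProto (for example
    onnx.load("resnet18.onnx").graph); the helper itself is an illustration.
    """
    hits = []
    for node in graph.node:
        if node.op_type != "Gather":
            continue
        node_axis = 0  # ONNX default when no axis attribute is stored
        for attr in node.attribute:
            if attr.name == "axis":
                node_axis = attr.i
        if node_axis == axis:
            hits.append(node)
    return hits
```

With the `onnx` package installed, `for n in find_gather_nodes(onnx.load("resnet18.onnx").graph): print(list(n.output))` would list the outputs of the matching nodes, which for the graph printed earlier should include %192.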

Hi, I just added the check and got the following from print(parser.get_error(0)):

<tensorrt.tensorrt.ParserError object at 0x7f29b974fa40>

This is not very informative to me, and I did not see the same error message shown in the link you provided. What is the error message on your side?

Hello,

You can print what fields of the error you want to see by accessing the ParserError members:

https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/parsers/Onnx/pyOnnx.html#tensorrt.ParserError
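
For example, a minimal sketch that reports every recorded error rather than only the first object. `format_parser_errors` is a hypothetical helper name; it assumes the `num_errors` and `get_error` members of the parser and the `code()`, `desc()`, `node()`, `file()`, `line()` and `func()` accessors listed for ParserError on the page above.

```python
def format_parser_errors(parser):
    """Return one readable line per error recorded by an ONNX parser.

    `parser` is expected to behave like trt.OnnxParser after a failed
    parse() call; this helper is an illustration, not part of TensorRT.
    """
    lines = []
    for i in range(parser.num_errors):
        err = parser.get_error(i)
        lines.append(
            "error {}: node {}: {} (code {}) at {}:{} in {}".format(
                i, err.node(), err.desc(), err.code(),
                err.file(), err.line(), err.func()
            )
        )
    return lines
```

In `build_engine_onnx`, the `print(parser.get_error(0))` line could then be replaced with `print("\n".join(format_parser_errors(parser)))`.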