[TensorRT] Converted ONNX model inference error


Direct conversion from torch to tensorrt using torch2trt failed with error:

Warning: Encountered known unsupported method torch.nn.functional.interpolate
[TensorRT] ERROR: (Unnamed Layer* 40) [Concatenation]: all concat input tensors must have the same number of dimensions, but mismatch at input 1. Input 0 shape: [1,128,14,14], Input 1 shape: [256,14,14]
Traceback (most recent call last):
  File "demo_det.py", line 17, in <module>
    model_trt = torch2trt(net, [x])
  File "/home/vision/project/zqy_pld/torch2trt/torch2trt.py", line 377, in torch2trt
    outputs = module(*inputs)
  File "/home/vision/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/vision/project/zqy_pld/jobs/det/models.py", line 248, in forward
    x = module(x)
  File "/home/vision/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/vision/.local/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/vision/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/vision/project/zqy_pld/torch2trt/torch2trt.py", line 202, in wrapper
    converter['converter'](ctx)
  File "/home/vision/project/zqy_pld/torch2trt/converters/Conv2d.py", line 9, in convert_Conv2d
    input_trt = trt_(ctx.network, input)
  File "/home/vision/project/zqy_pld/torch2trt/torch2trt.py", line 116, in trt_
    num_dim = len(t._trt.shape) # non-leaf tensors must already have _trt, get shape from that
ValueError: __len__() should return >= 0

Then I tried converting the torch model to ONNX and then to TensorRT. The conversion succeeded, and the ONNX output matches the torch output.
But when I run inference, TensorRT raises this error:

[TensorRT] ERROR: Parameter check failed at: engine.cpp::enqueue::273, condition: bindings[x] != nullptr

This is my inference code:

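import numpy as np
import pycuda.autoinit  # assumed: creates the CUDA context
import pycuda.driver as cuda

# `engine` is the TensorRT engine built from the ONNX model (loading not shown)
test_image = np.ones((1, 3, 256, 512))
input = test_image.astype(np.float32)
output = np.zeros(12288, dtype=np.float32)
d_input = cuda.mem_alloc(1572864)  # 1*3*256*512 float32 values, in bytes
d_output = cuda.mem_alloc(49152)   # 12288 float32 values, in bytes
stream = cuda.Stream()
context = engine.create_execution_context()
cuda.memcpy_htod_async(d_input, input, stream)
context.execute_async(bindings=[int(d_input), int(d_output)], stream_handle=stream.handle)
cuda.memcpy_dtoh_async(output, d_output, stream)
stream.synchronize()  # wait for the async copy before reading `output`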

The output shape in torch is 2x3x1x1x32x64. I understand this code only handles single-output inference, but I cannot find any tutorial covering multiple outputs. I’m also not sure whether the problem lies with the TensorRT engine or with my inference code.
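For anyone hitting the same error: the message `bindings[x] != nullptr` is consistent with the bindings list having fewer entries than the engine has bindings, for example when the network has several outputs but only one output buffer is passed. Below is a minimal sketch of allocating one buffer pair per binding. It assumes static shapes and the index-based binding API (`num_bindings`, `get_binding_shape`, `get_binding_dtype`) found in the TensorRT 6/7/8 Python bindings. The helper names `binding_nbytes` and `allocate_bindings` are made up for illustration.

```python
import numpy as np


def binding_nbytes(shape, dtype):
    """Bytes needed for one binding with a static shape and a NumPy dtype."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize


def allocate_bindings(engine):
    """Allocate one host/device buffer pair per engine binding (hypothetical helper)."""
    # Imported lazily so binding_nbytes can be used on a machine without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as cuda
    import tensorrt as trt

    host, device = [], []
    for i in range(engine.num_bindings):
        shape = tuple(engine.get_binding_shape(i))
        dtype = trt.nptype(engine.get_binding_dtype(i))
        host.append(cuda.pagelocked_empty(shape, dtype))           # pinned host buffer
        device.append(cuda.mem_alloc(binding_nbytes(shape, dtype)))  # device buffer
    return host, device
```

The list passed as `bindings` would then be `[int(d) for d in device]`, with one pointer for every input and output. As a sanity check, `binding_nbytes((1, 3, 256, 512), np.float32)` gives the 1572864 bytes used for `d_input` above.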

Any help is appreciated. Thanks.


TensorRT Version:
GPU Type: P4000
Nvidia Driver Version: 410.48
CUDA Version: 10.0
CUDNN Version: 7.6.5
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
PyTorch Version (if applicable): 1.2.0

Relevant Files

Network definition is at: https://github.com/koyeongmin/PINet/blob/master/hourglass_network.py

Solved by changing the inference code according to the output shape.

But now I get a new warning:
[TensorRT] WARNING: Explicit batch network detected and batch size specified, use enqueue without batch size instead.

Any idea how to suppress this warning?

In this case, you can use enqueueV2 instead of enqueue. In the Python API, that means calling execute_async_v2 instead of execute_async.


Thank you for your reply.

In order to replace enqueue with enqueueV2, do I need to modify the TensorRT engine or the original ONNX model? And is there any tutorial on this?

Many thanks!

Please refer to the sample below as a reference:
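As a rough illustration (this is not the referenced sample itself), neither the engine nor the ONNX model should need changing: only the call on the execution context differs. The sketch below assumes per-binding `host` and `device` buffer lists already exist and that binding 0 is the only input. The function name `infer_v2` and its `input_indices` and `cuda` parameters are hypothetical; `cuda` is only there so the function can be exercised without a GPU.

```python
def infer_v2(context, host, device, stream, input_indices=(0,), cuda=None):
    """Run one inference on an explicit-batch engine via execute_async_v2.

    host/device are lists with one buffer per binding; input_indices says
    which bindings are inputs (engine.binding_is_input(i) could supply this).
    """
    if cuda is None:
        import pycuda.driver as cuda  # requires an active CUDA context

    # Copy every input to the device.
    for i in input_indices:
        cuda.memcpy_htod_async(device[i], host[i], stream)

    # No batch_size argument: the batch dimension is part of the binding shapes.
    context.execute_async_v2(bindings=[int(d) for d in device],
                             stream_handle=stream.handle)

    # Copy every output back to the host.
    outputs = [i for i in range(len(device)) if i not in input_indices]
    for i in outputs:
        cuda.memcpy_dtoh_async(host[i], device[i], stream)

    stream.synchronize()  # wait for the copies before reading host buffers
    return [host[i] for i in outputs]
```

With the code from the question, the equivalent change would be replacing `context.execute_async(...)` with `context.execute_async_v2(...)` and keeping the same `bindings` and `stream_handle` arguments.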