[TensorRT] Converted ONNX model inference error

Wenbin_Xu · May 6, 2020, 8:29am

Description

Direct conversion from torch to tensorrt using torch2trt failed with error:

Warning: Encountered known unsupported method torch.nn.functional.interpolate
[TensorRT] ERROR: (Unnamed Layer* 40) [Concatenation]: all concat input tensors must have the same number of dimensions, but mismatch at input 1. Input 0 shape: [1,128,14,14], Input 1 shape: [256,14,14]
Traceback (most recent call last):
  File "demo_det.py", line 17, in <module>
    model_trt = torch2trt(net, [x])
  File "/home/vision/project/zqy_pld/torch2trt/torch2trt.py", line 377, in torch2trt
    outputs = module(*inputs)
  File "/home/vision/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/vision/project/zqy_pld/jobs/det/models.py", line 248, in forward
    x = module(x)
  File "/home/vision/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/vision/.local/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/vision/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/vision/project/zqy_pld/torch2trt/torch2trt.py", line 202, in wrapper
    converter['converter'](ctx)
  File "/home/vision/project/zqy_pld/torch2trt/converters/Conv2d.py", line 9, in convert_Conv2d
    input_trt = trt_(ctx.network, input)
  File "/home/vision/project/zqy_pld/torch2trt/torch2trt.py", line 116, in trt_
    num_dim = len(t._trt.shape) # non-leaf tensors must already have _trt, get shape from that
ValueError: __len__() should return >= 0

Then I converted the PyTorch model to ONNX and from ONNX to TensorRT. The conversion succeeded, and the ONNX output matches the PyTorch output.
But when I run inference, TensorRT raises this error:

[TensorRT] ERROR: Parameter check failed at: engine.cpp::enqueue::273, condition: bindings[x] != nullptr

This is my inference code:

test_image = np.ones((1, 3, 256, 512))
input = test_image.astype(np.float32)
output = np.zeros(12288, dtype=np.float32)
d_input = cuda.mem_alloc(1572864)
d_output = cuda.mem_alloc(49152)
stream = cuda.Stream()
context = engine.create_execution_context()
cuda.memcpy_htod_async(d_input, input, stream)
context.execute_async(bindings=[int(d_input), int(d_output)], stream_handle=stream.handle)
cuda.memcpy_dtoh_async(output, d_output, stream)
stream.synchronize()

The output shape in PyTorch is 2x3x1x1x32x64. I understand this code is written for single-output inference, but I cannot find any tutorial for multiple outputs. I'm also not sure whether the problem lies in the TensorRT engine or in my inference code.
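For reference, the literal sizes passed to cuda.mem_alloc above are byte counts for float32 data. A minimal sketch of that arithmetic, assuming 4-byte float32 elements and the shapes quoted in this post (the helper name nbytes is hypothetical, not part of TensorRT or PyCUDA):

```python
from functools import reduce
from operator import mul

FLOAT32_BYTES = 4  # assumption: all buffers hold float32 values


def nbytes(shape, itemsize=FLOAT32_BYTES):
    """Bytes needed for a dense tensor of the given shape."""
    return reduce(mul, shape, 1) * itemsize


input_shape = (1, 3, 256, 512)       # shape of test_image
output_shape = (2, 3, 1, 1, 32, 64)  # output shape reported from PyTorch

print(nbytes(input_shape))   # 1572864
print(nbytes(output_shape))  # 49152
```

A single 49152-byte buffer only matches the engine if the whole 2x3x1x1x32x64 result is exposed as one output binding. If the engine exposes it as several output bindings, each binding would need its own device buffer and its own entry in the bindings list.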

Any help is appreciated. Thanks.

Environment

TensorRT Version: 6.0.1.5
GPU Type: P4000
Nvidia Driver Version: 410.48
CUDA Version: 10.0
CUDNN Version: 7.6.5
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
PyTorch Version (if applicable): 1.2.0

Relevant Files

Network definition is at: PINet/hourglass_network.py at master · koyeongmin/PINet · GitHub

Wenbin_Xu · May 6, 2020, 9:32am

Solved by changing the inference code according to the output shape.
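The exact change is not shown in the thread. One common way to do this is to allocate one device buffer per engine binding instead of hard-coding a single input and a single output. A rough sketch under these assumptions: the engine has static shapes, every binding is float32, and the helper names plan_bindings and allocate are hypothetical. The engine calls used (num_bindings, get_binding_name, binding_is_input, get_binding_shape) belong to the binding-index API of TensorRT 6/7 and may not exist in much newer releases.

```python
from functools import reduce
from operator import mul

FLOAT32_BYTES = 4  # assumption: every binding is float32


def plan_bindings(engine):
    """Return one (name, is_input, shape, nbytes) tuple per engine binding."""
    plan = []
    for i in range(engine.num_bindings):
        # Assumes static shapes; a dynamic dimension would appear as -1 here.
        shape = tuple(engine.get_binding_shape(i))
        size = reduce(mul, shape, 1) * FLOAT32_BYTES
        plan.append((engine.get_binding_name(i),
                     engine.binding_is_input(i),
                     shape,
                     size))
    return plan


def allocate(plan, mem_alloc):
    """Allocate one device buffer per binding.

    mem_alloc is the allocator to use, e.g. pycuda.driver.mem_alloc.
    Returns the buffers and the integer pointers to pass as `bindings`.
    """
    buffers = [mem_alloc(size) for _, _, _, size in plan]
    return buffers, [int(b) for b in buffers]
```

With PyCUDA this would be used roughly as buffers, bindings = allocate(plan_bindings(engine), cuda.mem_alloc), after which bindings holds one non-null pointer per binding, which is what the bindings[x] != nullptr check requires.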

But got one new warning:
[TensorRT] WARNING: Explicit batch network detected and batch size specified, use enqueue without batch size instead.

Any idea how to suppress this warning?

SunilJB · May 6, 2020, 10:59am

In this case, you can use enqueueV2 instead of enqueue.

Thanks

Wenbin_Xu · May 7, 2020, 12:54am

Thank you for your reply.

In order to replace enqueue with enqueueV2, should I surgically modify the tensorrt engine or the original onnx model? And is there any tutorial on this?

Many thanks!

SunilJB · May 7, 2020, 4:52am

Please refer to the sample below:

github.com/NVIDIA/TensorRT/blob/572d54f91791448c015e74a4f1d6923b77b79795/samples/opensource/sampleINT8API/sampleINT8API.cpp#L639

// Create CUDA stream for the execution of this inference
cudaStream_t stream;
CHECK(cudaStreamCreate(&stream));

// Asynchronously copy data from host input buffers to device input buffers
buffers.copyInputToDeviceAsync(stream);

// Asynchronously enqueue the inference work
if (!context->enqueueV2(buffers.getDeviceBindings().data(), stream, nullptr))
{
    return Logger::TestResult::kFAILED;
}

// Asynchronously copy data from device output buffers to host output buffers
buffers.copyOutputToHostAsync(stream);

// Wait for the work in the stream to complete
cudaStreamSynchronize(stream);

Thanks
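Neither the engine nor the ONNX model should need modification for this: enqueueV2 is a method on the execution context, so the switch happens at the call site. In the Python API the counterpart is execute_async_v2, which takes no batch size. A minimal sketch, assuming the installed TensorRT Python package provides execute_async_v2 (the wrapper name infer_explicit_batch is hypothetical):

```python
def infer_explicit_batch(context, bindings, stream_handle):
    """Enqueue inference on an explicit-batch engine without a batch size.

    context       -- a TensorRT IExecutionContext
    bindings      -- list of int device pointers, one per engine binding
    stream_handle -- CUDA stream handle, e.g. pycuda's stream.handle
    """
    # execute_async carries a batch_size argument (default 1), which is what
    # triggers the warning on explicit-batch engines; the v2 call has none.
    ok = context.execute_async_v2(bindings=bindings,
                                  stream_handle=stream_handle)
    if not ok:
        raise RuntimeError("TensorRT failed to enqueue inference")
    return ok
```

In the inference code from the first post, this would replace the context.execute_async(...) line, with the copies and stream.synchronize() left as they are.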
