Driver error when upgrading from TensorRT 6 to TensorRT 7 using YOLOv4 in INT8 mode

My code works with TensorRT 6 in INT8 and with TensorRT 7 in FP32, but when I use TensorRT 7 in INT8 I get the error shown in the screenshot below.

[error screenshot]

This is the verbose output:

[verbose log screenshot]

Environment

TensorRT Version: 7.1.3
GPU Type: Jetson AGX iGPU
Nvidia Driver Version:
CUDA Version: 10.2
cuDNN Version: 8
Operating System + Version: Ubuntu 18.04

Hi,

Here is a similar issue that turned out to be an environment problem:

Would you mind checking whether that suggestion helps in your use case as well?
If the issue persists, please share the detailed steps with us so we can reproduce it.

Thanks.

Hi!

We’ve tried that approach but still failed to solve the problem on Xavier NX. How can we diagnose the issue? We will attach our steps later. Thanks.

Ray

Hi @AastaLLL
I’m Ray’s coworker. I’m trying another approach to solve this problem, but now I’m getting different errors. You can follow this post.

ERROR: builtin_op_importers.cpp:2179 In function importPad inputs.at(1).is_weights()

Thanks!

Hi both,

Have you solved this issue with the new approach?
Thanks.

No, we still have the same problem.

Hi,

Would you mind sharing the detailed steps so we can reproduce this in our environment?
Thanks.

You can follow this reply. I have provided the model and the steps in the following link.

Thanks

Hi,

We tried to reproduce this issue in our environment,
but we cannot download the ONNX file shared in this comment.

Could you make the model public so we can download it?

Thanks.

Hi,

I have sent you the download link by message.

Thanks.

Thanks. We got the model successfully.
We will update here with any progress.

Hi,

We can reproduce this issue in our environment.
Based on the topic below, this issue can be avoided by setting the opset version to 9.
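
If the model is exported from PyTorch (an assumption; the export path isn’t shown in this thread), the opset can be pinned at export time. A minimal sketch with a stand-in network:

```python
import torch
import torchvision

# Stand-in network; substitute your actual YOLOv4 module here.
model = torchvision.models.resnet18(pretrained=False).eval()
dummy_input = torch.randn(1, 3, 416, 416)  # assumed YOLOv4-style input shape

torch.onnx.export(
    model,
    dummy_input,
    "model_opset9.onnx",
    opset_version=9,  # pin the opset so Pad is exported with constant pads
)
```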

Could you check if it helps first?
Thanks.

Hi,
Apologies for the late reply; I’ve been busy this past week.
I have tried it but still get errors like the ones in the following picture.

Thanks.

Hi,

The root cause is in the Pad_51 layer.

Please note that the parameters of a padding layer need to be a pre-defined constant rather than a tensor input.
In your model, the padding parameter is defined as a tensor produced by some runtime calculation, even though its value is constant.

A workaround is to calculate the value of tensor 226 and replace it with a constant input.
This can be achieved with our ONNX GraphSurgeon API:
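
A rough sketch of that workaround (the file names are placeholders, and the pads value here is illustrative; it has to be calculated from your model’s constant subgraph):

```python
import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("yolov4.onnx"))  # placeholder file name

# Locate the Pad node whose "pads" input (tensor 226) is computed at runtime.
pad_node = next(n for n in graph.nodes if n.name == "Pad_51")

# Replace the computed pads input with a pre-calculated constant.
pad_node.inputs[1] = gs.Constant(
    "pads_const",
    np.array([0, 0, 0, 0, 0, 0, 1, 1], dtype=np.int64),  # illustrative value
)

# Remove the now-unused subgraph nodes and save the model.
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "modified.onnx")
```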

Thanks.

Hi,

Thanks for your reply. I’ll give it a try this week and let you know the result.

Thanks.

Hi,

I’m trying to follow your workaround, but I’ve run into a problem. I believe the workflow is: use ONNX GraphSurgeon to add an output to “Cast_50”, use onnxruntime to get the “Cast_50” output, and then use onnx_graphsurgeon.Constant to replace it. However, I can’t add an output to “Cast_50” because the node only allows one output.
Is this workflow correct? If so, can you give me a sample or a hint for calculating the tensor 226 (the Cast_50 output)?
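
For reference, this is roughly what I was attempting (the file name and input shape are placeholders on my side):

```python
import numpy as np
import onnx
import onnxruntime as ort

model = onnx.load("yolov4.onnx")  # placeholder file name

# Expose the intermediate tensor "226" (the Cast_50 output) as an extra
# graph output so onnxruntime will return its value.
model.graph.output.append(onnx.ValueInfoProto(name="226"))
onnx.save(model, "debug.onnx")

sess = ort.InferenceSession("debug.onnx")
dummy = np.zeros((1, 3, 416, 416), dtype=np.float32)  # assumed input shape
value = sess.run(["226"], {sess.get_inputs()[0].name: dummy})
print(value)
```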

Thanks!

Hi,

You should replace the Cast_50 node with a constant node directly.
The parameter can be calculated like this:

(204 constant) [0, 1, 0, 1] → (207 gather) [0] → …

Here is an online visualization tool that can help you calculate the output.
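
Alternatively, since the chain starts from constants, ONNX GraphSurgeon’s constant folding may collapse it automatically. A sketch, assuming the subgraph really has no runtime inputs (the file names are placeholders):

```python
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("yolov4.onnx"))  # placeholder file name

# fold_constants() uses onnxruntime to evaluate subgraphs whose inputs are
# all constants, which should collapse the Constant -> Gather -> ... ->
# Cast_50 chain into a single constant feeding the Pad node.
graph.fold_constants().cleanup().toposort()

onnx.save(gs.export_onnx(graph), "folded.onnx")
```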

Thanks.

Hi,

I’ve calculated that the Cast_50 output is [array([0, 0, 0, 0, 0, 0, 1, 1], dtype=int64)], replaced the tensor value with that constant, and run the model successfully with onnxruntime.
When I use trtexec to convert the ONNX model to a TRT engine, I get a new error:
[8] Assertion failed: mode == “constant” && value == 0.f && “This version of TensorRT only supports constant 0 padding!”
It looks like the same error as when setting the opset version to 9 (I’m using opset 11 now).
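
For completeness, this is how I inspected the Pad node in the modified model (a quick diagnostic sketch):

```python
import onnx

model = onnx.load("modified.onnx")

# TensorRT 7 only accepts Pad with mode="constant" and a zero fill value,
# so check the mode attribute and the inputs (data, pads, constant_value).
for node in model.graph.node:
    if node.op_type == "Pad":
        mode = next((a.s.decode() for a in node.attribute if a.name == "mode"),
                    "constant")
        print(node.name, "mode:", mode, "inputs:", list(node.input))
```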

This is the model:
https://drive.google.com/file/d/1fG-2Y1fGZc5Gabw2dQzZjysi6SZumr-6/view?usp=sharing

The trtexec command is:
sudo ./trtexec --onnx=modified.onnx --verbose

Thanks.

Hi,

The Drive link doesn’t have permissions enabled.
Could you check it?

Thanks.

Hi,
Sorry, I’ve updated the link.
https://drive.google.com/file/d/1fG-2Y1fGZc5Gabw2dQzZjysi6SZumr-6/view?usp=sharing

Thanks.