Failed to convert quantized ONNX model to TensorRT engine

Description

1. Converting the original ONNX model to an engine with trtexec works fine.
2. Converting the quantized ONNX model, which was verified with onnxruntime, fails with the following error:

[07/11/2024-06:35:43] [V] [TRT] Importing initializer: head.obj_preds.2.bias_quantized_zero_point
[07/11/2024-06:35:43] [V] [TRT] Parsing node: head.cls_preds.0.bias_DequantizeLinear [DequantizeLinear]
[07/11/2024-06:35:43] [V] [TRT] Searching for input: head.cls_preds.0.bias_quantized
[07/11/2024-06:35:43] [V] [TRT] Searching for input: head.cls_preds.0.bias_quantized_scale
[07/11/2024-06:35:43] [V] [TRT] Searching for input: head.cls_preds.0.bias_quantized_zero_point
[07/11/2024-06:35:43] [V] [TRT] head.cls_preds.0.bias_DequantizeLinear [DequantizeLinear] inputs: [head.cls_preds.0.bias_quantized -> (16)[INT32]], [head.cls_preds.0.bias_quantized_scale -> (1)[FLOAT]], [head.cls_preds.0.bias_quantized_zero_point -> (1)[INT32]],
[07/11/2024-06:35:43] [V] [TRT] Registering layer: head.cls_preds.0.bias_quantized for ONNX node: head.cls_preds.0.bias_quantized
[07/11/2024-06:35:43] [V] [TRT] Registering layer: head.cls_preds.0.bias_quantized_scale for ONNX node: head.cls_preds.0.bias_quantized_scale
[07/11/2024-06:35:43] [V] [TRT] Registering layer: head.cls_preds.0.bias_quantized_zero_point for ONNX node: head.cls_preds.0.bias_quantized_zero_point
[07/11/2024-06:35:43] [E] Error[3]: head.cls_preds.0.bias_DequantizeLinear: only activation types allowed as input to this layer.
[07/11/2024-06:35:43] [E] [TRT] ModelImporter.cpp:726: While parsing node number 0 [DequantizeLinear -> "head.cls_preds.0.bias"]:
[07/11/2024-06:35:43] [E] [TRT] ModelImporter.cpp:727: --- Begin node ---
[07/11/2024-06:35:43] [E] [TRT] ModelImporter.cpp:728: input: "head.cls_preds.0.bias_quantized"
input: "head.cls_preds.0.bias_quantized_scale"
input: "head.cls_preds.0.bias_quantized_zero_point"
output: "head.cls_preds.0.bias"
name: "head.cls_preds.0.bias_DequantizeLinear"
op_type: "DequantizeLinear"
[07/11/2024-06:35:43] [E] [TRT] ModelImporter.cpp:729: --- End node ---
[07/11/2024-06:35:43] [E] [TRT] ModelImporter.cpp:731: ERROR: ModelImporter.cpp:185 In function parseGraph:
[6] Invalid Node - head.cls_preds.0.bias_DequantizeLinear
head.cls_preds.0.bias_DequantizeLinear: only activation types allowed as input to this layer.
[07/11/2024-06:35:43] [E] Failed to parse onnx file
[07/11/2024-06:35:43] [I] Finish parsing network model
[07/11/2024-06:35:43] [E] Parsing model failed
[07/11/2024-06:35:43] [E] Failed to create engine from model or file.
[07/11/2024-06:35:43] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --verbose --onnx=yolox_s.onnx --saveEngine=yolox_s.engine --int8 --minShapes=input:1x3x640x640 --optShapes=input:2x3x640x640 --maxShapes=input:4x3x640x640 --workspace=4096
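For context, the node the parser rejects is a DequantizeLinear whose first input is an INT32 bias initializer, as the inputs line in the log shows. Below is a minimal sketch, not part of the original post, for locating all such nodes with the onnx Python package; the file name is taken from the trtexec command above:

```python
# Minimal sketch: list DequantizeLinear nodes that dequantize an INT32
# initializer -- the pattern trtexec rejects above with "only activation
# types allowed as input to this layer."
import onnx

model = onnx.load("yolox_s.onnx")  # path from the trtexec command
init_types = {init.name: init.data_type for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type != "DequantizeLinear":
        continue
    if init_types.get(node.input[0]) == onnx.TensorProto.INT32:
        print("INT32 bias DequantizeLinear:", node.name)
```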

Environment

TensorRT Version: 8.5.5.2
GPU Type: Jetson Orin Nano
CUDA Version: 11.4
CUDNN Version: 8.6
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): Python 3.8.10
Docker Container: nvcr.io/nvidia/deepstream-l4t:6.2-samples

Hi @zhang.ga ,
Can you help us with the model and repro steps?

Thanks

Repro steps:
1. Quantize the ONNX FP16 model to an ONNX INT8 model (a sketch follows the command below).
2. Run trtexec to convert the INT8 ONNX model to a TensorRT engine:
/usr/src/tensorrt/bin/trtexec --verbose --onnx=yolox_s.onnx --saveEngine=yolox_s.engine --int8 --minShapes=input:1x3x640x640 --optShapes=input:2x3x640x640 --maxShapes=input:4x3x640x640 --workspace=4096
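A minimal sketch of what step 1 might look like; the attached quantization.py.txt is the authoritative script, and the calibration reader and input path below are illustrative placeholders:

```python
# Hedged sketch of step 1 with onnxruntime's static QDQ quantizer, which
# inserts the DequantizeLinear nodes seen in the log (including the INT32
# bias DQ that the TensorRT parser then rejects).
import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)

class DummyCalibrationReader(CalibrationDataReader):
    """Feeds random 1x3x640x640 tensors; replace with real calibration images."""

    def __init__(self, num_samples=8):
        self._batches = iter(
            {"input": np.random.rand(1, 3, 640, 640).astype(np.float32)}
            for _ in range(num_samples)
        )

    def get_next(self):
        return next(self._batches, None)

quantize_static(
    "yolox_s.onnx",                # illustrative input path
    "yolox_s_int8.onnx",           # matches the attached model name
    DummyCalibrationReader(),
    quant_format=QuantFormat.QDQ,  # explicit Q/DQ nodes, as in the log
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
)
```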

quantization.py.txt (2.1 KB)
yolox_s_int8.onnx.txt (8.7 MB)
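The onnxruntime verification mentioned in the description is not posted; a hedged sketch of such a check, with the input name and shape taken from the trtexec command, could look like:

```python
# Hedged sketch: confirm the quantized model runs under onnxruntime.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "yolox_s_int8.onnx", providers=["CPUExecutionProvider"]
)
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
outputs = session.run(None, {"input": dummy})
print([out.shape for out in outputs])
```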

Hi, have you reproduced the issue?

Hello, anybody here?