Trtexec cannot convert QAT onnx model to trt model

QQQQ · August 3, 2022, 2:53pm

Hi, I used TensorFlow 2.4 to train a mixed-precision model, fine-tuned it with TensorFlow's quantization aware training (QAT), and saved it as an .onnx file. When I ran the following command to convert the ONNX model to a TensorRT engine, it raised an error. Details are shown below.
JetPack 4.6.2
TensorFlow 2.4
tf2onnx 1.12

./trtexec --onnx=quantized_dirfrom_qat_model.onnx --saveEngine=quantized_dirfrom_qat_model_trt8.2.trt --minShapes=input_3:1x224x224x1 --optShapes=input_3:2x224x224x1 --maxShapes=input_3:2x224x224x1 --workspace=4096 --verbose  --int8

[08/03/2022-14:36:24] [V] [TRT] sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel_dequant [DequantizeLinear] inputs: [sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel -> (38, 1, 3, 3)[INT8]], [scale__99 -> (38)[FLOAT]], [zero_point__52 -> (38)[INT8]],
[08/03/2022-14:36:24] [V] [TRT] Registering layer: sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel for ONNX node: sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel
[08/03/2022-14:36:24] [E] Error[3]: sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel: invalid weights type of Int8
Segmentation fault (core dumped)

I also attached my test onnx model
quantized_dirfrom_qat_model.onnx (232.8 KB)

AastaLLL · August 4, 2022, 3:33am

Hi,

Thanks for reporting this.
Confirmed that we can reproduce this issue on Orin.

Do you use JetPack 5.0.1 DP?
For JetPack 5, the TensorRT version should be 8.4, which is different from your filename.

Thanks.

QQQQ · August 4, 2022, 5:56am

Here is the information from jtop. The conversion is still not working.

[08/04/2022-05:55:04] [V] [TRT] Searching for input: sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel
[08/04/2022-05:55:04] [V] [TRT] Searching for input: scale__99
[08/04/2022-05:55:04] [V] [TRT] Searching for input: zero_point__52
[08/04/2022-05:55:04] [V] [TRT] sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel_dequant [DequantizeLinear] inputs: [sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel -> (38, 1, 3, 3)[INT8]], [scale__99 -> (38)[FLOAT]], [zero_point__52 -> (38)[INT8]], 
[08/04/2022-05:55:04] [V] [TRT] Registering layer: sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel for ONNX node: sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel
[08/04/2022-05:55:04] [E] Error[3]: sequential/mobilenet_0.15_224/quant_conv_dw_6/depthwise;sequential/mobilenet_0.15_224/quant_conv_dw_6/LastValueQuant/FakeQuantWithMinMaxVarsPerChannel: invalid weights type of Int8
Segmentation fault (core dumped)

QQQQ · August 5, 2022, 3:25am

Hi, any updates?

AastaLLL · August 5, 2022, 3:56am

Hi,

It looks like you are using Xavier instead of Orin.
I will move your topic to the Xavier board.

In your QAT model, there are some usages that are not supported by TensorRT:

Int8 zero-point
Quantizing the bias
Asymmetric quantization (zero-point != 0)
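
To make the last point concrete, here is a minimal sketch in plain Python of how a symmetric and an asymmetric int8 scheme derive their scale and zero-point from the same value range. It follows the ONNX QuantizeLinear/DequantizeLinear arithmetic (q = round(x / scale) + zero_point, clamped to int8). The function names and the example range [0.0, 6.0] are assumptions chosen for illustration; they are not taken from the attached model, tf2onnx, or TensorRT.

```python
def symmetric_params(min_val, max_val, num_bits=8):
    """Symmetric scheme: the range is centred on 0.0, so zero_point is always 0."""
    qmax = 2 ** (num_bits - 1) - 1                # 127 for int8
    max_abs = max(abs(min_val), abs(max_val))
    return max_abs / qmax, 0


def asymmetric_params(min_val, max_val, num_bits=8):
    """Asymmetric (affine) scheme: maps [min_val, max_val] onto the full int range."""
    qmin = -(2 ** (num_bits - 1))                 # -128 for int8
    qmax = 2 ** (num_bits - 1) - 1                # 127 for int8
    # Widen the range so that 0.0 is always exactly representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    zero_point = round(qmin - min_val / scale)
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x, scale, zero_point, num_bits=8):
    """q = round(x / scale) + zero_point, saturated to the signed int range."""
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    return max(qmin, min(qmax, round(x / scale) + zero_point))


def dequantize(q, scale, zero_point):
    """x is approximately (q - zero_point) * scale."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    lo, hi = 0.0, 6.0                             # hypothetical ReLU6-style range
    for name, fn in (("symmetric", symmetric_params),
                     ("asymmetric", asymmetric_params)):
        scale, zp = fn(lo, hi)
        q = quantize(hi, scale, zp)
        print(name, "scale =", scale, "zero_point =", zp,
              "q(6.0) =", q, "dequantized =", dequantize(q, scale, zp))
```

For a one-sided range like this, the asymmetric scheme ends up with a non-zero zero-point, which is the case listed above as unsupported. The symmetric scheme keeps the zero-point at 0 at the cost of leaving the negative half of the int8 range unused.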

Please try our newly released quantization toolkit for TensorFlow to see if it helps:

Thanks.

QQQQ · August 8, 2022, 8:45am

From the link you posted, I see that the toolkit requires TensorRT 8.4, but we are using TensorRT 8.2. Is this tool compatible with TensorRT 8.2?

AastaLLL · August 9, 2022, 1:53am

Hi,

Do you have a dependency on JetPack 4.6.2?
If not, you can get TensorRT 8.4 with JetPack 5.

Thanks.

system · August 31, 2022, 2:25am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
