Convert an int8 ONNX model to a TensorRT engine?

I tried to convert model.int8.onnx to a TensorRT engine, but encountered an error…

The model was already quantized with the ONNX Runtime framework, so it contains QuantizeLinear/DequantizeLinear nodes, which are supported by TensorRT according to this page (onnx-tensorrt/operators.md at main · onnx/onnx-tensorrt · GitHub).
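
For context, the quantization was done with ONNX Runtime's static quantizer, roughly like this (a minimal sketch, not the exact script; the calibration reader, input name, and shapes are placeholders):

import numpy as np
from onnxruntime.quantization import (CalibrationDataReader, QuantFormat,
                                      QuantType, quantize_static)

class DummyCalibrationReader(CalibrationDataReader):
    """Feeds a few batches of placeholder data for calibration."""
    def __init__(self, input_name="input", shape=(1, 3, 224, 224), n=8):
        self.batches = iter(
            [{input_name: np.random.rand(*shape).astype(np.float32)}
             for _ in range(n)]
        )

    def get_next(self):
        return next(self.batches, None)

quantize_static(
    "test.onnx",                       # float32 source model
    "test.int8-onnx-calibrated.onnx",  # quantized output
    DummyCalibrationReader(),
    quant_format=QuantFormat.QDQ,      # insert QuantizeLinear/DequantizeLinear pairs
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
)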

I used the command:
./trtexec --onnx=test.int8-onnx-calibrated.onnx --saveEngine=engine.trt

I thought it could be converted, but the following errors appeared.

Is it not possible to convert an int8 ONNX model to a TensorRT engine?
Best regards.

[02/16/2023-18:18:25] [I] [TRT] ----------------------------------------------------------------
[02/16/2023-18:18:25] [I] [TRT] Input filename:   test.int8-onnx-calibrated.onnx
[02/16/2023-18:18:25] [I] [TRT] ONNX IR version:  0.0.6
[02/16/2023-18:18:25] [I] [TRT] Opset version:    11
[02/16/2023-18:18:25] [I] [TRT] Producer name:    onnx.utils.extract_model
[02/16/2023-18:18:25] [I] [TRT] Producer version:
[02/16/2023-18:18:25] [I] [TRT] Domain:
[02/16/2023-18:18:25] [I] [TRT] Model version:    0
[02/16/2023-18:18:25] [I] [TRT] Doc string:
[02/16/2023-18:18:25] [I] [TRT] ----------------------------------------------------------------
[02/16/2023-18:18:25] [W] [TRT] onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.

[02/16/2023-18:18:25] [E] Error[2]: [scaleNode.cpp::getChannelAxis::27] Error Code 2: Internal Error (features.0.bias_DequantizeLinear_quantize_scale_node: out of bounds channel axis 1. Number of input dimensions is 1.)
[02/16/2023-18:18:25] [E] [TRT] ModelImporter.cpp:720: While parsing node number 0 [DequantizeLinear -> "features.0.bias"]:
[02/16/2023-18:18:25] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[02/16/2023-18:18:25] [E] [TRT] ModelImporter.cpp:722: input: "features.0.bias_quantized"
input: "features.0.bias_quantized_scale"
input: "features.0.bias_quantized_zero_point"
output: "features.0.bias"
name: "features.0.bias_DequantizeLinear"
op_type: "DequantizeLinear"

[02/16/2023-18:18:25] [E] [TRT] ModelImporter.cpp:723: --- End node ---
[02/16/2023-18:18:25] [E] [TRT] ModelImporter.cpp:726: ERROR: ModelImporter.cpp:179 In function parseGraph:
[6] Invalid Node - features.0.bias_DequantizeLinear
[scaleNode.cpp::getChannelAxis::27] Error Code 2: Internal Error (features.0.bias_DequantizeLinear_quantize_scale_node: out of bounds channel axis 1. Number of input dimensions is 1.)

[02/16/2023-18:18:25] [E] Failed to parse onnx file
[02/16/2023-18:18:25] [I] Finish parsing network model
[02/16/2023-18:18:25] [E] Parsing model failed
[02/16/2023-18:18:25] [E] Failed to create engine from model.
[02/16/2023-18:18:25] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8205] # ./trtexec --onnx=test.int8-onnx-calibrated.onnx
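
The node the parser rejects can be located with a small script like this (a sketch; it just lists every DequantizeLinear whose first input is a 1-D initializer, i.e. a quantized bias):

import onnx
from onnx import numpy_helper

model = onnx.load("test.int8-onnx-calibrated.onnx")
inits = {init.name: init for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type != "DequantizeLinear" or node.input[0] not in inits:
        continue
    data = numpy_helper.to_array(inits[node.input[0]])
    if data.ndim == 1:  # the 1-D case that triggers the channel-axis error
        print(node.name, "input shape:", data.shape)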


Hi,

We recommend trying the latest TensorRT version, 8.5.3. If you still face the issue, please share complete verbose logs and an ONNX model that reproduces it.
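
For example:

./trtexec --onnx=test.int8-onnx-calibrated.onnx --saveEngine=engine.trt --verbose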

Thank you.

I tried with TensorRT 8.5.3 and hit the same issue:

[02/17/2023-09:54:47] [I] TensorRT version: 8.5.3
[02/17/2023-09:54:53] [I] [TRT] [MemUsageChange] Init CUDA: CPU +533, GPU +0, now: CPU 540, GPU 1282 (MiB)
[02/17/2023-09:54:53] [I] Start parsing network model
[02/17/2023-09:54:53] [I] [TRT] ----------------------------------------------------------------
[02/17/2023-09:54:53] [I] [TRT] Input filename:   test.int8-onnx-calibrated.onnx
[02/17/2023-09:54:53] [I] [TRT] ONNX IR version:  0.0.6
[02/17/2023-09:54:53] [I] [TRT] Opset version:    11
[02/17/2023-09:54:53] [I] [TRT] Producer name:    onnx.utils.extract_model
[02/17/2023-09:54:53] [I] [TRT] Producer version:
[02/17/2023-09:54:53] [I] [TRT] Domain:
[02/17/2023-09:54:53] [I] [TRT] Model version:    0
[02/17/2023-09:54:53] [I] [TRT] Doc string:
[02/17/2023-09:54:53] [I] [TRT] ----------------------------------------------------------------
[02/17/2023-09:54:53] [W] [TRT] onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[02/17/2023-09:54:53] [E] Error[2]: [scaleNode.cpp::getChannelAxis::27] Error Code 2: Internal Error (features.0.bias_DequantizeLinear_quantize_scale_node: out of bounds channel axis 1. Number of input dimensions is 1.)
[02/17/2023-09:54:53] [E] [TRT] ModelImporter.cpp:720: While parsing node number 0 [DequantizeLinear -> "features.0.bias"]:
[02/17/2023-09:54:53] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[02/17/2023-09:54:53] [E] [TRT] ModelImporter.cpp:722: input: "features.0.bias_quantized"
input: "features.0.bias_quantized_scale"
input: "features.0.bias_quantized_zero_point"
output: "features.0.bias"
name: "features.0.bias_DequantizeLinear"
op_type: "DequantizeLinear"

[02/17/2023-09:54:53] [E] [TRT] ModelImporter.cpp:723: --- End node ---
[02/17/2023-09:54:53] [E] [TRT] ModelImporter.cpp:726: ERROR: ModelImporter.cpp:179 In function parseGraph:
[6] Invalid Node - features.0.bias_DequantizeLinear
[scaleNode.cpp::getChannelAxis::27] Error Code 2: Internal Error (features.0.bias_DequantizeLinear_quantize_scale_node: out of bounds channel axis 1. Number of input dimensions is 1.)
[02/17/2023-09:54:53] [E] Failed to parse onnx file
[02/17/2023-09:54:53] [I] Finish parsing network model
[02/17/2023-09:54:53] [E] Parsing model failed
[02/17/2023-09:54:53] [E] Failed to create engine from model or file.
[02/17/2023-09:54:53] [E] Engine set up failed

Please share the ONNX model that reproduces the issue, here or via DM, so we can debug further.

Thank you.

I suspect the reason is that, in my ONNX model, the Conv node has three quantized inputs: input, weight, and bias.

However, trtexec seems to assume only two quantized inputs: input and weight (with the bias left unquantized?). When the model is quantized with TensorRT/tools/pytorch-quantization at main · NVIDIA/TensorRT · GitHub, the generated ONNX model has only two quantized inputs, and it can be converted to a TensorRT engine via trtexec. A possible workaround along these lines is sketched below.
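
If the bias Q/DQ pair really is what trips the parser, one possible workaround is to fold each bias DequantizeLinear back into a plain float32 initializer before handing the model to trtexec, so the bias is no longer quantized. This is only a sketch and untested against this exact model; the file names come from this thread, and it assumes the opset-11 DequantizeLinear semantics y = (x - zero_point) * scale:

import numpy as np
import onnx
from onnx import numpy_helper

model = onnx.load("test.int8-onnx-calibrated.onnx")
graph = model.graph
inits = {init.name: init for init in graph.initializer}

for node in list(graph.node):
    # Only fold DequantizeLinear nodes whose inputs are all initializers
    if node.op_type != "DequantizeLinear":
        continue
    if any(name and name not in inits for name in node.input):
        continue
    q = numpy_helper.to_array(inits[node.input[0]]).astype(np.float64)
    if q.ndim != 1:  # only fold 1-D tensors, i.e. biases
        continue
    scale = numpy_helper.to_array(inits[node.input[1]]).astype(np.float64)
    zp = (numpy_helper.to_array(inits[node.input[2]]).astype(np.float64)
          if len(node.input) > 2 and node.input[2] else 0.0)
    # Replace the DQ node with a float32 constant under its output name
    bias_fp32 = ((q - zp) * scale).astype(np.float32)
    graph.initializer.append(numpy_helper.from_array(bias_fp32, node.output[0]))
    graph.node.remove(node)

onnx.save(model, "test.int8-bias-folded.onnx")

After folding, the Conv bias is an ordinary float input again, which matches what pytorch-quantization exports.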

Is converting an “onnx-int8” model to a TensorRT engine simply not supported?

[screenshot: コメント 2023-02-17 160807]

Is there no workaround?

Hi @Chubby,
Are you still facing the issue?

Thanks