Fake quantization ONNX model parse ERROR using TensorRT 8




TensorRT Version:
GPU Type: 2080
Nvidia Driver Version: 470.63.01
CUDA Version: 11.3
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.7
PyTorch Version (if applicable): 1.9

I successfully per-channel quantized and exported a calibrated INT8 model using the pytorch-quantization toolkit and torch.onnx.export (opset 13).
The QuantizeLinear nodes attached to the quantized weights carry a scale tensor whose length equals the number of output channels of the layer they quantize, so the ONNX exporter also sets an axis attribute; this per-channel form comes from fake_quantize_per_channel_affine, which is exportable as of opset 13.
However, when I load the ONNX file, the checker fails with the following error:

Traceback (most recent call last):
  File "/vayaalgo/Work/Pruning/Convert2TRT/main.py", line 244, in <module>
    import_ONNX()
  File "/vayaalgo/Work/Pruning/Convert2TRT/utils/models_quant.py", line 15, in import_ONNX
    engine = backend.prepare(depthnet, device='CUDA:0')
  File "/usr/local/lib/python3.7/dist-packages/onnx_tensorrt-8.0.1-py3.7.egg/onnx_tensorrt/backend.py", line 235, in prepare
  File "/usr/local/lib/python3.7/dist-packages/onnx/backend/base.py", line 74, in prepare
    onnx.checker.check_model(model)
  File "/usr/local/lib/python3.7/dist-packages/onnx/checker.py", line 91, in check_model
    C.check_model(model.SerializeToString())
onnx.onnx_cpp2py_export.checker.ValidationError: Unrecognized attribute: axis for operator QuantizeLinear

==> Context: Bad node spec: input: "features_0_0.weight" input: "465" input: "1337" output: "468" name: "QuantizeLinear_7" op_type: "QuantizeLinear" attribute { name: "axis" i: 0 type: INT }

How can I solve this issue?

Could you please share the ONNX model and the script, if not shared already, so that we can assist you better?
Meanwhile, you can try a few things:

  1. Validate your model with the snippet below:

import onnx

filename = yourONNXmodel  # placeholder: path to your ONNX file
model = onnx.load(filename)
onnx.checker.check_model(model)  # raises ValidationError if the model is invalid

  2. Try running your model with the trtexec command.

In case you are still facing the issue, please share the trtexec --verbose log for further debugging.
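
For step 2, a typical invocation looks like this (the model filename is taken from this thread; trtexec ships in the TensorRT bin directory, and the --int8 flag enables the INT8 builder for a Q/DQ model):

```shell
# Parse and build the quantized ONNX model with TensorRT's bundled CLI,
# capturing a verbose log for debugging.
trtexec --onnx=quantized_depthnet.onnx --int8 --verbose > trtexec_verbose.log 2>&1
```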

quantized_depthnet.onnx (7.7 MB)
Please find the attached ONNX file describing my model.
The model = onnx.load(filename) call succeeds, but onnx.checker.check_model(model) fails with the same error as above: onnx.onnx_cpp2py_export.checker.ValidationError: Unrecognized attribute: axis for operator QuantizeLinear.


It looks like something is going wrong in the PyTorch to ONNX conversion. Please refer to the following link/user guide to make sure you're converting correctly.

Thank you.