Does TensorRT quantization use int8 or uint8?


The TensorRT developer guide says the quantized range is [-128, 127], which suggests int8.
However, when I convert my TensorFlow quantization-aware-trained model to ONNX and then to TRT, the following error appears:

ModelImporter.cpp:778: ERROR: builtin_op_importers.cpp:1173 In function QuantDequantLinearHelper:
[6] Assertion failed: shiftIsAllZeros(zeroPoint) && "TRT only supports symmetric quantization - zeroPt must be all zeros"

Does this mean I should use uint8 ([0, 255]) instead?
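For reference, here is a minimal numpy sketch (my own illustration, not TRT internals) of the two schemes. It shows why a symmetric int8 scheme always has a zero point of 0, while an asymmetric uint8 scheme generally does not, which is exactly what the `shiftIsAllZeros(zeroPoint)` assertion checks:

```python
import numpy as np

def symmetric_int8_params(x):
    # Symmetric: scale maps max(|x|) to 127; zero_point is always 0.
    scale = np.max(np.abs(x)) / 127.0
    return scale, np.int8(0)

def asymmetric_uint8_params(x):
    # Asymmetric: the full [min, max] range is mapped onto [0, 255];
    # zero_point is generally non-zero, which trips TRT's assertion.
    lo, hi = np.min(x), np.max(x)
    scale = (hi - lo) / 255.0
    zero_point = np.uint8(round(-lo / scale))
    return scale, zero_point

x = np.array([-1.0, -0.25, 0.0, 0.5, 1.5], dtype=np.float32)

s_scale, s_zp = symmetric_int8_params(x)    # s_zp == 0 -> passes the check
a_scale, a_zp = asymmetric_uint8_params(x)  # a_zp != 0 -> fails the check

q_sym = np.clip(np.round(x / s_scale), -128, 127).astype(np.int8)
print(s_zp, a_zp, q_sym)
```

So the error is not asking for uint8; it is asking for Q/DQ nodes whose zero points are all zero, i.e. symmetric int8 quantization.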


TensorRT Version: 8.6
GPU Type: NVIDIA A4000
Nvidia Driver Version:
CUDA Version: 11.8
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable): 2.12
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Hi, please refer to the links below to perform inference in INT8.
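As a rough sketch (a configuration fragment only, assuming the `tensorrt` Python package and a QAT ONNX model with Q/DQ nodes), enabling INT8 in the builder looks like this; with explicit Q/DQ quantization the scales come from the model itself, so no calibrator is needed:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)  # enable INT8 kernels
# Parse the ONNX model (with its QuantizeLinear/DequantizeLinear nodes)
# into `network`, then build the engine with `config`.
```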