How to set my own quantized weight and bias scales (not activation)?


Hi all,

After reading TensorRT-Best-Practices.pdf, TensorRT-Developer-Guide.pdf, and the sampleINT8 & sampleINT8API samples, I cannot find a way to set my own quantized weight and bias scales during the layer-building process (not for activations).

For example, a quantized convolution weight & bias can be initialized as nvinfer1::Weights{nvinfer1::DataType::kINT8, data_ptr, data_length}, but no scale or range is included, and ILayer (and its derived classes) has no setDynamicRange-like function either.
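To make the gap concrete, here is a minimal sketch of the symmetric per-tensor quantization one would do offline before filling such an INT8 nvinfer1::Weights buffer. The helper names are my own illustration, not TensorRT API; the point is that the resulting scale has nowhere to go in the Weights struct, and in TRT 7.x can only be expressed indirectly as a tensor dynamic range of +/- (scale * 127):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Symmetric per-tensor INT8 quantization (illustrative helper, not TRT API).
// fp32 value ~= int8 value * scale; the scale itself cannot be attached to
// nvinfer1::Weights, which carries only {type, values, count}.
struct Quantized {
    std::vector<int8_t> data;
    float scale;
};

Quantized quantizeSymmetric(const std::vector<float>& w) {
    float maxAbs = 0.f;
    for (float v : w) maxAbs = std::max(maxAbs, std::fabs(v));
    float scale = maxAbs / 127.f;  // symmetric: qmax = 127 for INT8
    Quantized q{{}, scale};
    q.data.reserve(w.size());
    for (float v : w) {
        float r = std::round(v / scale);
        r = std::min(127.f, std::max(-127.f, r));  // clamp to INT8 range
        q.data.push_back(static_cast<int8_t>(r));
    }
    return q;
}
```

The dynamic range that TensorRT's ITensor::setDynamicRange accepts for a tensor corresponds to scale * 127, but there is no per-weight equivalent on ILayer.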

So the question is: is it possible for users to build a fully self-quantized network directly (without going through the TensorRT INT8 calibrator route)?


TensorRT Version: up to 7.2
GPU Type: Turing and later
Nvidia Driver Version: 450 and above
CUDA Version: 10.2 and above
CUDNN Version: 7.2 and above
Operating System + Version: both windows and linux
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):


Hi @314377460,
TRT 7.x has only limited support for user-provided quantized weights.
Please refer to the link below for quantized weights.

Thank you.
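A common workaround in TRT 7.x, given this limited support, is to dequantize your INT8 weights back to FP32 with your own scale, hand TensorRT the FP32 buffer as nvinfer1::Weights{DataType::kFLOAT, ...}, and supply activation scales via ITensor::setDynamicRange. A sketch of the dequantization step, with the TensorRT calls omitted and the buffer contents purely illustrative:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Dequantize user-quantized INT8 weights back to FP32 (illustrative helper,
// not TRT API). TensorRT re-quantizes weights internally; since its weight
// quantization is also symmetric, the original INT8 values are typically
// recovered when the max-magnitude element maps to +/-127.
std::vector<float> dequantize(const std::vector<int8_t>& q, float scale) {
    std::vector<float> w;
    w.reserve(q.size());
    for (int8_t v : q) w.push_back(static_cast<float>(v) * scale);
    return w;
}
```

This keeps your scales in control of the effective quantization while staying on the supported FP32-weights path.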