How to set my own quantized weight and bias scales (not activation)?


Hi all,

After reading TensorRT-Best-Practices.pdf, TensorRT-Developer-Guide.pdf, and the sampleINT8 & sampleINT8API samples, I cannot find a way to set my own quantized weight and bias scales during the layer-building process (not for activations).

For example, a quantized convolution weight & bias can be initialized as nvinfer1::Weights{nvinfer1::DataType::kINT8, data_ptr, data_length}, but no scale or range is included, and ILayer (and its derived classes) has no setDynamicRange-like function either.
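To make the gap concrete, here is a minimal sketch of the symmetric per-tensor quantization one would do offline before filling such an INT8 nvinfer1::Weights buffer. The helper names are my own illustration, not TensorRT API; the point is that the resulting scale has nowhere to go in the Weights struct, and in TRT 7.x can only be expressed indirectly as a tensor dynamic range of +/- (scale * 127):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Symmetric per-tensor INT8 quantization (illustrative helper, not TRT API).
// fp32 value ~= int8 value * scale; the scale itself cannot be attached to
// nvinfer1::Weights, which carries only {type, values, count}.
struct Quantized {
    std::vector<int8_t> data;
    float scale;
};

Quantized quantizeSymmetric(const std::vector<float>& w) {
    float maxAbs = 0.f;
    for (float v : w) maxAbs = std::max(maxAbs, std::fabs(v));
    float scale = maxAbs / 127.f;  // symmetric: qmax = 127 for INT8
    Quantized q{{}, scale};
    q.data.reserve(w.size());
    for (float v : w) {
        float r = std::round(v / scale);
        r = std::min(127.f, std::max(-127.f, r));  // clamp to INT8 range
        q.data.push_back(static_cast<int8_t>(r));
    }
    return q;
}
```

The dynamic range that TensorRT's ITensor::setDynamicRange accepts for a tensor corresponds to scale * 127, but there is no per-weight equivalent on ILayer.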

So the question is: is it possible for users to build a fully self-quantized network directly (without going through the TensorRT INT8 calibrator route)?


TensorRT Version: up to 7.2
GPU Type: Turing and later
Nvidia Driver Version: 450 and above
CUDA Version: 10.2 and above
CUDNN Version: 7.2 and above
Operating System + Version: both windows and linux
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):


Hi @314377460,
TRT 7.x has only limited support for user-provided quantized weights.
Please refer to the link below for quantized weights.

Thank you.
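A common workaround in TRT 7.x, given this limited support, is to dequantize your INT8 weights back to FP32 with your own scale, hand TensorRT the FP32 buffer as nvinfer1::Weights{DataType::kFLOAT, ...}, and supply activation scales via ITensor::setDynamicRange. A sketch of the dequantization step, with the TensorRT calls omitted and the buffer contents purely illustrative:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Dequantize user-quantized INT8 weights back to FP32 (illustrative helper,
// not TRT API). TensorRT re-quantizes weights internally; since its weight
// quantization is also symmetric, the original INT8 values are typically
// recovered when the max-magnitude element maps to +/-127.
std::vector<float> dequantize(const std::vector<int8_t>& q, float scale) {
    std::vector<float> w;
    w.reserve(q.size());
    for (int8_t v : q) w.push_back(static_cast<float>(v) * scale);
    return w;
}
```

This keeps your scales in control of the effective quantization while staying on the supported FP32-weights path.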