I’m exploring the TRT Quantization Toolkit and would like to work through a simple example to make sure I understand it.
I’ll have a single Conv2d layer network with pretrained weights.
As I understand it, to calibrate the network I need to swap my original Conv2d layer for a QuantConv2d layer, which has input and weight quantizers. After doing this, I noticed that the network’s named_modules now includes 3 entries: the QuantConv2d itself, an _input_quantizer, and a _weight_quantizer.
When collecting statistics, should I just feed my input through the QuantConv2d, or do something like what is described here?
TensorRT Version: 8
GPU Type: 2080 TI
Nvidia Driver Version: 470.57.02
CUDA Version: 11.3
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.7
PyTorch Version (if applicable): 1.9