Hi,
I have been using the INT8 Entropy Calibrator 2 (trt.IInt8EntropyCalibrator2) for INT8 quantization in Python and it has been working well (TensorRT 10.0.1). My usage follows the example in the official TensorRT GitHub repo (TensorRT/samples/python/efficientdet/build_engine.py at release/10.0 · NVIDIA/TensorRT · GitHub).
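For context, my calibrator follows roughly this pattern, trimmed down from the linked sample. The ImageBatcher helper comes from that sample and isn't shown here, and the names may differ slightly from my actual code:

```python
import os

import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt


class EngineCalibrator(trt.IInt8EntropyCalibrator2):
    """INT8 entropy calibrator, roughly as in the efficientdet sample."""

    def __init__(self, cache_file):
        super().__init__()
        self.cache_file = cache_file
        self.image_batcher = None
        self.batch_allocation = None
        self.batch_generator = None

    def set_image_batcher(self, image_batcher):
        # ImageBatcher is the sample's helper that yields preprocessed batches.
        self.image_batcher = image_batcher
        size = int(np.dtype(self.image_batcher.dtype).itemsize
                   * np.prod(self.image_batcher.shape))
        self.batch_allocation = cuda.mem_alloc(size)
        self.batch_generator = self.image_batcher.get_batch()

    def get_batch_size(self):
        return self.image_batcher.batch_size if self.image_batcher else 1

    def get_batch(self, names):
        if self.image_batcher is None:
            return None
        try:
            batch, _, _ = next(self.batch_generator)
            cuda.memcpy_htod(self.batch_allocation, np.ascontiguousarray(batch))
            return [int(self.batch_allocation)]
        except StopIteration:
            return None  # no more batches: calibration is done

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```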
Starting with TensorRT 10.1, I have been getting a warning that the INT8 Entropy Calibrator 2 is deprecated and has been superseded by explicit quantization.
I’ve read the official documentation on the difference between the implicit and explicit quantization processes (Developer Guide :: NVIDIA Deep Learning TensorRT Documentation), and the two seem to work quite differently. Explicit quantization expects the network to already contain QuantizeLayer and DequantizeLayer (Q/DQ) layers, which my networks don’t have, while implicit quantization is used precisely when those layers are absent. So I am confused about how implicit quantization can be superseded by explicit quantization when they operate so differently.
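From what I can tell, with explicit quantization the ONNX model itself would already contain QuantizeLinear/DequantizeLinear nodes (inserted with their scales by some QAT/PTQ tool before export), and the build step would then look roughly like the sketch below, with no calibrator at all. Here model_qdq.onnx is just a hypothetical pre-quantized model, not something I currently have:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(0)
parser = trt.OnnxParser(network, logger)

# The ONNX file is assumed to already contain Q/DQ nodes with their scales.
with open("model_qdq.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the Q/DQ ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)  # scales come from the Q/DQ nodes, not from a calibrator

engine_bytes = builder.build_serialized_network(network, config)
```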
So my question is: what needs to be modified in the standard INT8 Entropy Calibrator 2 quantization method (TensorRT/samples/python/efficientdet/build_engine.py at release/10.0 · NVIDIA/TensorRT · GitHub) so that the deprecation warning no longer shows up? I couldn’t find any example using a newer TensorRT version (10.1 and up).
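For reference, the part of my build script that seems to trigger the warning is essentially this (again trimmed from the sample; EngineCalibrator is the class shown above, and my_image_batcher stands in for my own ImageBatcher instance):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(0)
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)

# Attaching the implicit-quantization calibrator is what the deprecation
# message appears to refer to.
calibrator = EngineCalibrator("calibration.cache")
calibrator.set_image_batcher(my_image_batcher)  # my own batcher instance
config.int8_calibrator = calibrator

engine_bytes = builder.build_serialized_network(network, config)
```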
Thank you!
Environment
TensorRT Version: 10.0.1
GPU Type: 3090
Operating System + Version: Windows 10
Python version: 3.9.19