TensorRT - INT8 Quantization - weights - activations

Hello everyone,

Can you please tell me whether INT8 quantization in TensorRT (TRT5) quantizes only the activations,
or whether it quantizes both weights and activations to INT8 precision?


It quantizes both weights and activations to INT8 precision. However, TRT 5.x doesn’t accept already-quantized weights as input from the user.
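
To make "both weights and activations in INT8" concrete, here is a minimal sketch of symmetric linear quantization in plain Python. It is a generic illustration only, not TensorRT's internal code: the function names (maxabs_scale, quantize), the max-abs scale choice, and the sample values are all assumptions made for this example.

```python
# Generic symmetric INT8 quantization sketch (NOT TensorRT's implementation).
# All names and sample values here are hypothetical.

def maxabs_scale(values, qmax=127):
    """Return a scale that maps the largest magnitude in `values` to qmax."""
    peak = max(abs(v) for v in values)
    return peak / qmax if peak > 0 else 1.0

def quantize(values, scale, qmax=127):
    """Round each value to the nearest step and clamp to [-qmax, qmax]."""
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

weights = [0.5, -1.27, 0.02]
activations = [3.0, 0.0, -6.35]

# Each tensor gets its own scale.
w_scale = maxabs_scale(weights)
a_scale = maxabs_scale(activations)
w_q = quantize(weights, w_scale)
a_q = quantize(activations, a_scale)

# Integer dot product accumulated in a wide integer, then rescaled to float.
acc = sum(w * a for w, a in zip(w_q, a_q))
approx = acc * w_scale * a_scale

# Reference result in floating point, for comparison.
exact = sum(w * a for w, a in zip(weights, activations))
```

Comparing approx with exact shows the error introduced by rounding both operands to 8 bits. The scale here comes from the maximum absolute value; a calibrator could instead pick a smaller, saturating threshold for the same quantize step.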



Thanks for the answer.

Do you know whether the weights are quantized using the entropy calibrator, or whether they are quantized using min/max quantization?