The result of TensorRT QAT does not match the result of PyTorch QAT in INT8 mode

Description

I want to use the setDynamicRange() interface to achieve QAT quantization in TensorRT, so I set the default QAT config in PyTorch and trained a QAT version of the model. I multiply the scale parameter of each tensor by 127 and pass the result to tensor->setDynamicRange(), then call builder->setInt8Mode(true) and builder->setStrictTypeConstraints(true) to ensure that TensorRT runs in INT8 mode. However, when I tested with a model containing only one conv layer, I found that this layer seemed to fall back to FP32.

Environment

TensorRT Version: 5.1.3
GPU Type: RTX 2060
Nvidia Driver Version:
CUDA Version: 10.0
CUDNN Version: 7.5
Operating System + Version: Ubuntu 16.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.4.0
Baremetal or Container (if container which image + tag):

Relevant Files


Steps To Reproduce

You can see from the first picture that I set the scale parameter correctly, and at the same time you can see from the second picture that this layer is running at higher precision. I think that if this layer were running in INT8 mode, it should output this line of information:


The result of this model differs from the PyTorch QAT result, but it is the same as the FP32 result, which also supports my conclusion.
Is there any way I can avoid this fallback mechanism?

You can set the builder config flag BuilderFlag::kSTRICT_TYPES to force the network or layer precision.

Thanks

I think config->createBuilderConfig is an interface of TensorRT 7.0. Is it not the same as builder->setStrictTypeConstraints(true) in TensorRT 5.1.3?

That API will be removed in a future release; use IBuilderConfig::setFlag instead.
Please refer to the link below for more details:
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-700/tensorrt-api/c_api/classnvinfer1_1_1_i_builder.html#af0499e6f27a54a66e1a7eb72c262fadb
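To make the version difference concrete, here is a sketch of the two API generations side by side (this is a configuration fragment that needs the TensorRT headers to compile, so it is not runnable standalone; the function names are my own):

```cpp
#include "NvInfer.h"

// TensorRT 5.1.3: precision flags live on the builder itself.
void enableStrictInt8Trt5(nvinfer1::IBuilder* builder) {
    builder->setInt8Mode(true);
    builder->setStrictTypeConstraints(true);
}

// TensorRT 7.x: the equivalent settings moved to IBuilderConfig,
// and the builder-level setters above are deprecated.
void enableStrictInt8Trt7(nvinfer1::IBuilderConfig* config) {
    config->setFlag(nvinfer1::BuilderFlag::kINT8);
    config->setFlag(nvinfer1::BuilderFlag::kSTRICT_TYPES);
}
```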

Thanks

Thank you very much for your reply. I also want to ask: if I set the scale values in TensorRT correctly, can I get the same result as PyTorch QAT? I noticed that the scale parameter of the weights cannot be set in TensorRT, so I would like to ask whether the handling of weights in TensorRT's INT8 mode is the same as in PyTorch QAT.

Please refer to the link below:

Thanks