TF-TRT quantization-aware training - "Quantization range was not found" ERROR

Hello everyone,

I want to experiment with the INT8 quantization-aware training supported by TF-TRT (TRT5).

So I trained a deep convolutional NN model, adding a quantization node (tf.quantization.quantize_and_dequantize) after each convolutional block, as described in the documentation.
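For reference, a minimal sketch of the placement described above, using raw TF ops (the shapes, weights, and the [0, 6] range here are illustrative assumptions, not the actual model):

```python
import tensorflow as tf

def conv_block(x, filters):
    # Illustrative conv block: conv -> ReLU -> fake-quant node.
    w = tf.random.normal([3, 3, x.shape[-1], filters])
    y = tf.nn.conv2d(x, w, strides=1, padding="SAME")
    y = tf.nn.relu(y)
    # The quantize/dequantize node records the INT8 range TF-TRT needs
    # for this tensor; range_given=True pins it to [0, 6] here.
    return tf.quantization.quantize_and_dequantize(
        y, input_min=0.0, input_max=6.0, range_given=True)

x = tf.random.uniform([1, 32, 32, 3])
y = conv_block(x, 16)
```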

Here is a segment of the model summary:

Layer (type)                      Output Shape          Param #   Connected to
block_1/conv2d_1 (Conv2D)         (6, 800, 800, 32)     9248      tf_op_layer_QuantizeAndDequantize
block_1/activation_1 (Activatio   (6, 800, 800, 32)     0         block_1/conv2d_1[0][0]
tf_op_layer_QuantizeAndDequanti   [(6, 800, 800, 32)]   0         block_1/activation_1[0][0]
res_block_2/conv2d_0 (Conv2D)     (6, 400, 400, 24)     6936      tf_op_layer_QuantizeAndDequantize
res_block_2/activation_0 (Activ   (6, 400, 400, 24)     0         res_block_2/conv2d_0[0][0]
tf_op_layer_QuantizeAndDequanti   [(6, 400, 400, 24)]   0         res_block_2/activation_0[0][0]

However, when I run the TRT5 converter with use_calibration=False, TRT fails with the following error: Quantization range was not found for res_block_2/activation_0/Relu
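The conversion call looks roughly like this (a sketch of the TF 1.x-era TrtGraphConverter API; "qat_saved_model" is a hypothetical path standing in for my actual SavedModel directory):

```python
def convert_qat_model(saved_model_dir="qat_saved_model"):
    # Deferred import: TF-TRT needs the TensorRT libraries at runtime.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt
    converter = trt.TrtGraphConverter(
        input_saved_model_dir=saved_model_dir,
        precision_mode="INT8",
        # use_calibration=False tells TF-TRT to take the INT8 ranges
        # from the QDQ nodes instead of running a calibration pass.
        use_calibration=False,
    )
    return converter.convert()
```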

Why am I getting this error even though I added the quantization nodes in the right places?

Best,

Hi,

I’m not too familiar with quantization-aware training yet, but here are some things that might help in the meantime:

TensorRT 7.0 was released yesterday with support for quantization-aware training: https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-700/tensorrt-release-notes/tensorrt-7.html#rel_7-0-0

Maybe try converting the model with TensorRT 7 and see if it works there instead? See this section of the docs: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#work-with-qat-networks

TRT5 fails whenever a quantization range that TRT requires is missing.
TRT6 fixed that problem by falling back to higher precision for the corresponding ops when their ranges are missing.
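For intuition on what a "quantization range" buys TRT: quantize-and-dequantize snaps each value onto one of 2^num_bits levels spanning the tensor's range, so without a range for a tensor there is no INT8 mapping for it. A plain-Python sketch of the idea (unsigned, per-tensor; not the exact TF/TRT rounding rules):

```python
def fake_quantize(x, rmin, rmax, num_bits=8):
    # Quantize x onto 2**num_bits - 1 steps across [rmin, rmax],
    # then map it back to float. The (rmin, rmax) pair is exactly the
    # per-tensor information TRT reports as missing in the error above.
    levels = 2 ** num_bits - 1
    scale = (rmax - rmin) / levels
    q = round((x - rmin) / scale)
    q = max(0, min(levels, q))  # clamp to the representable range
    return rmin + q * scale
```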

If you still see errors with TRT6, the only solution I can think of is to add the missing ranges yourself. It’s a trial-and-error process.

Docs: https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#quantization-training