Is it normal that a TensorRT INT8 engine with a dynamic input shape (batch size) is more accurate than one with a fixed input shape?


When I convert my pretrained model to a TensorRT INT8 engine, I find that if the input shape is fixed, i.e.,

profile.set_shape(input_name, min=(32, 3, 512, 512), opt=(32, 3, 512, 512), max=(32, 3, 512, 512))

the accuracy drops dramatically compared to the engine generated with a dynamic input shape:

profile.set_shape(input_name, min=(16, 3, 512, 512), opt=(32, 3, 512, 512), max=(48, 3, 512, 512))

Is this expected behavior?


TensorRT Version:
GPU Type: NVIDIA GeForce RTX 3060
Nvidia Driver Version: 11.4
CUDA Version: cuda_11.3.r11.3
CUDNN Version: 8.1.1
Operating System + Version: Ubuntu 16.04.6 LTS (GNU/Linux 4.4.0-142-generic x86_64)
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.10.2+cu113
Baremetal or Container (if container which image + tag):


Could you please try the latest TensorRT version, 8.5.2, and let us know if you still face this issue?
Please also share a minimal repro (ONNX model and scripts) so we can debug this more effectively.

Thank you.

The problem is solved; it was actually a bug in my code. When I generated the fixed-shape engine, I used a faulty calibrator whose batch data was smaller than the GPU buffer it allocated, so the uninitialized tail of the buffer corrupted the calibration statistics and degraded the engine's accuracy.
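For anyone hitting a similar accuracy drop: the failure mode can be reproduced in plain NumPy, with no GPU required. This is a hypothetical sketch (toy shapes, `np.empty` standing in for an unzeroed `cudaMalloc`) showing how copying fewer elements than the allocated calibration buffer holds leaves an uninitialized tail, which the INT8 calibrator would then fold into its activation histograms:

```python
import numpy as np

# Toy shape standing in for the thread's (32, 3, 512, 512) batches.
BATCH, C, H, W = 32, 3, 8, 8


def fill_calibration_buffer(batch: np.ndarray, buffer_elems: int) -> np.ndarray:
    """Mimic a host-to-device copy into a pre-allocated calibration buffer.

    np.empty, like cudaMalloc, does NOT zero the memory: any elements the
    copy fails to overwrite keep whatever garbage was already there.
    """
    device_buf = np.empty(buffer_elems, dtype=np.float32)
    flat = batch.ravel()
    device_buf[: flat.size] = flat  # only this region is valid afterwards
    return device_buf


full_batch = np.random.rand(BATCH, C, H, W).astype(np.float32)
buffer_elems = full_batch.size

# Correct usage: the batch exactly fills the allocated buffer.
ok = fill_calibration_buffer(full_batch, buffer_elems)

# Buggy usage (the bug described above): the batch is smaller than the
# buffer allocated for the fixed shape, so the tail is uninitialized.
short_batch = full_batch[: BATCH // 2]
bad = fill_calibration_buffer(short_batch, buffer_elems)

valid = short_batch.size
print("valid region intact:", np.array_equal(bad[:valid], short_batch.ravel()))
print("uninitialized tail elements:", buffer_elems - valid)
```

The fix is simply to make the allocation (or the number of elements the calibrator reports per batch) match the actual size of the calibration data, so every element the calibrator reads was written by a real sample.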

