TRT INT8 quantization: does accuracy depend on the calibration dataset?

First, here are my experimental results:
ResNet50 (the official PyTorch model, trained on ImageNet)
The calibration dataset contains 1000 images from ImageNet, and it works fine.

SeResNext50 (my custom dataset, 4500+ images used in training)
The calibration dataset contains 400 images from the training set, and the accuracy is too low.

But when I use all 4500+ training images for calibration, the accuracy increases from 25.3% to 79.4%.

My questions:
1. How many images should be used for calibration?
2. How do I choose the most suitable calibration set?
3. Why doesn't SeResNext50 get much speedup from INT8?
4. Even though I used all the training data for calibration, the accuracy still dropped a lot. How can I avoid this?
5. Why does ResNet50 get a huge performance improvement with no loss of accuracy when only 1000 images are used for calibration?


Calibration images are used to collect the input distribution required to compute the correct scaling factors.
The number of calibration images required depends on the model and the dataset.
In some cases the calibration algorithm can achieve good accuracy with as few as 100 random images.
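
To make the link between the collected distribution and the scaling factor concrete, here is a minimal sketch of symmetric INT8 quantization. It is not TensorRT's own calibration algorithm (the entropy calibrator is more involved); it only compares two simple ways of choosing a scale, and all function names are made up for illustration.

```python
def max_abs_scale(values):
    """Scale chosen so the largest observed magnitude maps to 127."""
    amax = max(abs(v) for v in values)
    return amax / 127.0 if amax > 0 else 1.0


def percentile_scale(values, pct=99.9):
    """Scale chosen from a percentile of the magnitudes, so rare
    outliers are clipped instead of stretching the whole range."""
    mags = sorted(abs(v) for v in values)
    idx = min(len(mags) - 1, int(len(mags) * pct / 100.0))
    amax = mags[idx]
    return amax / 127.0 if amax > 0 else 1.0


def quantize(x, scale):
    """Map a float to a signed 8-bit level in [-127, 127]."""
    q = round(x / scale)
    return max(-127, min(127, q))


def dequantize(q, scale):
    """Map an 8-bit level back to an approximate float."""
    return q * scale


if __name__ == "__main__":
    # Hypothetical activations: mostly small values plus one outlier.
    acts = [0.01 * i for i in range(1000)] + [1000.0]
    for name, scale in [("max", max_abs_scale(acts)),
                        ("percentile", percentile_scale(acts))]:
        approx = dequantize(quantize(5.0, scale), scale)
        print(name, "scale:", scale, "5.0 is reconstructed as", approx)
```

With too few calibration images, the observed distribution may miss values that occur at inference time, or be dominated by a handful of outliers, and in both cases the chosen scale fits the real data poorly.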

When preparing the calibration dataset, you should capture the expected distribution of data in typical inference scenarios, so make sure that it covers all the expected scenarios.
The calibration dataset shouldn't overlap with the training, validation or test datasets, in order to avoid a situation where the calibrated model only works well on these datasets.
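
One way to follow this advice is to draw the same number of images per class from a held-out pool, so that every class is represented. The sketch below assumes the pool is a list of `(path, label)` pairs that is already separate from the training, validation and test sets; `pick_calibration_set` is a hypothetical helper, not part of TensorRT.

```python
import random
from collections import defaultdict


def pick_calibration_set(samples, per_class, seed=0):
    """Pick up to `per_class` image paths for every label.

    samples: list of (path, label) pairs from a held-out pool.
    Returns a shuffled list of paths.
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    chosen = []
    for label in sorted(by_label):
        paths = by_label[label]
        chosen.extend(rng.sample(paths, min(per_class, len(paths))))

    rng.shuffle(chosen)  # avoid feeding calibration batches class by class
    return chosen
```

For example, `pick_calibration_set(pool, per_class=50)` would return at most 50 paths per class, in random order.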

Try running the network in FP32. If it produces the correct result, it is possible that lower precision has insufficient dynamic range for the network.

You can also try setting the dynamic range of each network tensor manually using the setDynamicRange API.
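
As a rough sketch of what that could look like from Python: the function below walks a network and sets a symmetric range on every tensor. It assumes the network object exposes `num_inputs`, `get_input`, `num_layers` and `get_layer`, and that tensors expose `name` and `set_dynamic_range(min, max)`, which is how I understand the TensorRT Python counterpart of setDynamicRange; please check this against your TensorRT version. The `ranges` dictionary (tensor name to absolute maximum) is something you would have to measure yourself, and `apply_dynamic_ranges` is a made-up helper name.

```python
def apply_dynamic_ranges(network, ranges, default=None):
    """Set a symmetric dynamic range [-amax, amax] on each tensor.

    network: assumed to behave like a TensorRT network definition.
    ranges:  dict mapping tensor name -> absolute maximum value.
    default: fallback amax for tensors not listed in `ranges`.
    Returns the names of tensors that were left without a range.
    """
    missing = []

    def _apply(tensor):
        amax = ranges.get(tensor.name, default)
        if amax is None:
            missing.append(tensor.name)
            return
        tensor.set_dynamic_range(-amax, amax)

    # Network inputs first, then the outputs of every layer.
    for i in range(network.num_inputs):
        _apply(network.get_input(i))
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        for j in range(layer.num_outputs):
            _apply(layer.get_output(j))
    return missing
```

The returned list is useful as a check: any tensor named there still has no range.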

Please refer to the link below for more details:

For debugging, turn on INFO-level messages from the log stream and check what TensorRT is reporting. You can also use visualization tools.


Thanks for your reply!!!

I use the training dataset for calibration and also test accuracy on it. From my experimental results, the calibrated model doesn't even work well on the calibration dataset.

You can also see from my results that both FP32 models have the same accuracy as in PyTorch.
So is it possible that the lower precision has insufficient dynamic range for the SeResNext50 model? How can I deal with that?

Some other questions:
1. Regardless of accuracy, why doesn't the INT8 model have much speedup compared to FP32 for SeResNext50? My GPU is a P4.
2. What are the visualization tools for TensorRT debugging?

I turned on INFO-level messages and found that there are 371 layers in SeResNext50. If I set the first 150 layers to FP32, the accuracy goes up from 79% to 87%.
So now I'm wondering why the INT8 speed is not much improved.
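
For anyone who wants to repeat the mixed-precision experiment, this is roughly how the first layers could be kept in FP32. It is only a sketch: it assumes each layer exposes a writable `precision` attribute plus `set_output_type(index, dtype)`, `num_outputs` and `name`, as I understand the TensorRT Python API, and that the builder is told to respect these per-layer settings (the flag for this differs between TensorRT versions). `pin_first_layers` is a made-up helper name.

```python
def pin_first_layers(network, count, dtype):
    """Request `dtype` for the first `count` layers and their outputs.

    network: assumed to behave like a TensorRT network definition.
    dtype:   the precision to request, e.g. trt.float32.
    Returns the names of the layers that were pinned.
    """
    pinned = []
    limit = min(count, network.num_layers)
    for i in range(limit):
        layer = network.get_layer(i)
        layer.precision = dtype              # compute precision of the layer
        for j in range(layer.num_outputs):
            layer.set_output_type(j, dtype)  # precision of each output tensor
        pinned.append(layer.name)
    return pinned
```

With `import tensorrt as trt`, the call would be `pin_first_layers(network, 150, trt.float32)`. Note that every layer kept in FP32 no longer benefits from INT8 kernels, so this trades speed for accuracy.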

Can you share your code/script and model so I can try to reproduce this or debug it further?

Meanwhile, could you please try using the “trtexec” command to test the model?
“trtexec” is useful for benchmarking networks and makes it faster and easier to debug the issue.


I tested “trtexec”, but it can't measure accuracy.

Here are my script, model, test images and calibration dataset (the calibration dataset is also the training dataset), so you can test with them.
I mainly care about speed and accuracy.

Looking forward to your results!

Any update?

Do you have any result on the model?


A fix will be available in the next release. Please stay tuned.