TRT INT8 Quantization: Does Accuracy Depend on the Calibration Dataset?

xxdyx110 · November 27, 2019, 3:39am

First, my experimental results:
ResNet50 (official PyTorch model, trained on ImageNet)
https://drive.google.com/open?id=1iwe_bX41BHIIsNFy-37pIk4wyozLEl3o
The calibration dataset contains 1000 images from ImageNet, and it works fine.

SeResNext50 (my custom dataset, 4500+ images used in training)
https://drive.google.com/open?id=14vTnAIZvR_QuJ-txWePuzn1GxzzRwmAT
The calibration set contains 400 images from the training dataset, and the accuracy is too low.

But when I use all 4500+ training images for calibration, the result is below:
https://drive.google.com/open?id=1MLgvgkXUMqLbMOuzb_EhHvE0FLa4E4NW
The accuracy increases from 25.3% to 79.4%.

My questions:
1. How many images should be used for calibration?
2. How do I choose the most suitable calibration set?
3. Why doesn’t INT8 give SeResNext50 much speedup?
4. Even though I used all the training data for calibration, the accuracy still dropped a lot. How can I avoid this?
5. Why does ResNet50 get a huge performance improvement with no accuracy loss using only 1000 calibration images?

SunilJB · November 27, 2019, 5:19am

Hi,

Calibration images are used to collect the input distribution required to compute the correct scaling factors.
The number of calibration images required depends on the model and the dataset.
In some cases the calibration algorithm can achieve good accuracy with as few as 100 random images.
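
To illustrate why the calibration data affects the scaling factor, here is a minimal pure-Python sketch of symmetric INT8 quantization. It assumes the scale is taken from the largest absolute value seen during calibration, which is a simplification: TensorRT's entropy calibrator chooses its threshold differently. All values and function names are hypothetical.

```python
def int8_scale(calibration_values):
    """Scale that maps the largest calibration magnitude to 127."""
    max_abs = max(abs(v) for v in calibration_values)
    return max_abs / 127.0


def quantize(value, scale):
    """Round to the nearest INT8 step and clamp to [-127, 127]."""
    q = round(value / scale)
    return max(-127, min(127, q))


def dequantize(q, scale):
    """Map an INT8 step back to a floating-point value."""
    return q * scale


# Hypothetical activations: the narrow set misses the large values
# that show up at inference time, the wide set includes them.
narrow_calibration = [0.1, -0.4, 0.9, 1.0]
wide_calibration = [0.1, -0.4, 0.9, 1.0, 7.5, -8.0]

inference_value = 6.0
narrow_scale = int8_scale(narrow_calibration)
wide_scale = int8_scale(wide_calibration)

# With the narrow scale the value saturates at the clamp limit;
# with the wide scale it survives with a small rounding error.
narrow_result = dequantize(quantize(inference_value, narrow_scale), narrow_scale)
wide_result = dequantize(quantize(inference_value, wide_scale), wide_scale)
print(narrow_result, wide_result)
```

With the narrow set, an activation of 6.0 is clipped to roughly 1.0, while the wide set reproduces it to within one quantization step. This is one way an unrepresentative calibration set can hurt accuracy.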

When preparing the calibration dataset, you should capture the expected distribution of data in typical inference scenarios. You want to make sure that the calibration dataset covers all the expected scenarios.
The calibration dataset shouldn’t overlap with the training, validation, or test datasets, to avoid a situation where the calibrated model only works well on these datasets.
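
One possible way to cover all expected scenarios is to sample the calibration set per class from a held-out pool. The sketch below assumes a list of (path, label) pairs that is not part of the training, validation, or test data. The file names, labels, and the `pick_calibration_set` helper are hypothetical.

```python
import random
from collections import defaultdict


def pick_calibration_set(labeled_paths, per_class, seed=0):
    """Pick up to `per_class` images from every class so no class is left out."""
    by_class = defaultdict(list)
    for path, label in labeled_paths:
        by_class[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    chosen = []
    for label in sorted(by_class):
        paths = sorted(by_class[label])
        rng.shuffle(paths)
        chosen.extend(paths[:per_class])
    return chosen


# Hypothetical held-out pool: three classes with unequal image counts.
pool = (
    [(f"cat_{i}.jpg", "cat") for i in range(50)]
    + [(f"dog_{i}.jpg", "dog") for i in range(30)]
    + [(f"bird_{i}.jpg", "bird") for i in range(5)]
)
calibration = pick_calibration_set(pool, per_class=10)
print(len(calibration))
```

Rare classes contribute everything they have (5 images here), so they are not crowded out by the frequent ones.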

Try running the network in FP32. If it produces the correct result, it is possible that lower precision has insufficient dynamic range for the network.

You can also try setting manual dynamic ranges for each network tensor using the setDynamicRange API.
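
As a rough sketch of what setting manual ranges can look like from Python: the helper below walks a network and calls `set_dynamic_range` on each tensor, which is the Python counterpart of setDynamicRange on `ITensor`. It assumes `network` behaves like `tensorrt.INetworkDefinition` and that `ranges` is a hypothetical dict mapping tensor names to their largest absolute value, obtained elsewhere (for example from FP32 activation statistics). The helper name is an assumption, not part of TensorRT.

```python
def apply_dynamic_ranges(network, ranges):
    """Set a symmetric dynamic range on every tensor that has an entry in `ranges`.

    Returns the names of tensors without an entry, so gaps are visible
    instead of being silently skipped.
    """
    missing = []

    def set_range(tensor):
        max_abs = ranges.get(tensor.name)
        if max_abs is None:
            missing.append(tensor.name)
        else:
            tensor.set_dynamic_range(-max_abs, max_abs)

    # Network inputs are tensors too and need a range.
    for i in range(network.num_inputs):
        set_range(network.get_input(i))

    # Every layer output is a tensor that INT8 kernels may read or write.
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        for j in range(layer.num_outputs):
            set_range(layer.get_output(j))

    return missing
```

A call such as `apply_dynamic_ranges(network, {"data": 2.5})` would be made after parsing the model and before building the engine. Tensors reported as missing still need a range from a calibrator or from another source.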

Please refer to the links below for more details:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#enable_int8_c
https://github.com/NVIDIA/TensorRT/tree/release/6.0/samples/opensource/sampleINT8API

For debugging, turn on INFO-level messages from the log stream and check what TensorRT is reporting. You can also use visualization tools.

Thanks

xxdyx110 · November 27, 2019, 8:09am

Thanks for your reply!!!

I use the training dataset for calibration and test accuracy on it. From my experimental results, the calibrated model doesn’t even work well on the calibration dataset.

From my experimental results, you can see that the two FP32 models have the same accuracy as PyTorch.
So is it possible that lower precision has insufficient dynamic range for the SeResNext50 model? How can I deal with it?

Some other questions:
1. Regardless of accuracy, why doesn’t the INT8 model have much speedup compared to FP32 for SeResNext50? My GPU is a P4.
2. What are the visualization tools for TensorRT debugging?

xxdyx110 · November 28, 2019, 2:22am

I turned on INFO-level messages and found that there are 371 layers in SeResNext50. If I set the first 150 layers to FP32, the accuracy goes up from 79% to 87%.
So now I’m wondering why the INT8 speed is not much improved.
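
For readers who want to try the same experiment, pinning the first layers of a network to FP32 might look roughly like the sketch below. It assumes `network` behaves like `tensorrt.INetworkDefinition` and that `dtype` would be `tensorrt.float32` in a real build. The helper name is hypothetical and this is not the poster's actual script.

```python
def pin_first_layers(network, count, dtype):
    """Request `dtype` for the first `count` layers and all of their outputs.

    Returns the number of layers that were actually pinned, which is
    smaller than `count` when the network has fewer layers.
    """
    pinned = min(count, network.num_layers)
    for i in range(pinned):
        layer = network.get_layer(i)
        layer.precision = dtype  # precision used for the layer's computation
        for j in range(layer.num_outputs):
            layer.set_output_type(j, dtype)  # precision of each output tensor
    return pinned
```

These requests are typically only hints unless the builder is also told to enforce type constraints, and some layers may not accept a floating-point type at all, so a real script would need to skip those. Every layer kept in FP32 also gives up its INT8 speedup and may add format conversions at the boundary, which is one plausible reason a mixed-precision engine is not much faster than FP32.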

SunilJB · November 28, 2019, 7:07am

Hi,
Can you share your code/script and model so I can try to reproduce this or further debug?

Meanwhile, could you please try the “trtexec” command to test the model?
“trtexec” is useful for benchmarking networks and makes it faster and easier to debug the issue.

Thanks

xxdyx110 · December 2, 2019, 2:39pm

I tested “trtexec”, but it can’t measure accuracy.

https://drive.google.com/open?id=1hMhXChMXGLESzwxs9iMfctyhl03jNkXF

Here are my script, model, test images, and calibration dataset (the calibration dataset is also the training dataset); you can test with them.
I mainly care about speed and accuracy.

Looking forward to your results!
Thanks!

xxdyx110 · December 4, 2019, 5:57am

Any update?

xxdyx110 · December 10, 2019, 3:50am

Do you have any results for the model?

SunilJB · March 4, 2020, 9:07am

Hi,

A fix will be available in the next release. Please stay tuned.

Thanks

NVES · May 15, 2021, 2:07pm

Hi, please refer to the link below to perform inference in INT8:
https://github.com/NVIDIA/TensorRT/blob/master/samples/opensource/sampleINT8/README.md

Thanks!
