My questions:
1. How many calibration images should I use for calibration?
2. How do I choose the most suitable calibration set?
3. Why doesn't seresnext50 INT8 show much speedup?
4. Even though I used all the training data for calibration, the accuracy still dropped a lot. How can I avoid this?
5. Why does resnet50 get a large performance improvement with no accuracy loss when only 1000 images are used for calibration?
Calibration images are used for collecting the input distribution required to compute the correct scaling factor.
The number of calibration images required depends on the model and the dataset.
In some cases the calibration algorithm can achieve good accuracy with as few as 100 random images.
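To illustrate what a "scaling factor" means here, below is a minimal sketch of symmetric INT8 quantization in plain Python. It is a simplification: it takes the scale from the largest observed magnitude, whereas TensorRT's entropy calibrators choose the range differently. The function names and sample values are made up for illustration.

```python
def symmetric_int8_scale(samples):
    """Scale that maps the largest observed magnitude to 127 (max-based, simplified)."""
    max_abs = max(abs(x) for x in samples)
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(x, scale):
    """Round to the nearest INT8 step and clamp to the symmetric range [-127, 127]."""
    q = round(x / scale)
    return max(-127, min(127, q))


def dequantize(q, scale):
    """Map an INT8 value back to an approximate real value."""
    return q * scale


# Hypothetical activations seen while running the calibration images.
calibration_activations = [-0.8, 0.1, 2.54, 1.2]
scale = symmetric_int8_scale(calibration_activations)

# A value outside the calibrated range saturates at 127, which is why the
# calibration set has to cover the values seen at inference time.
saturated = quantize(10.0, scale)
```

If the calibration images never produce the large activations that real inputs do, those activations are clipped as in the last line, and accuracy drops.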
When preparing the calibration dataset, you should capture the expected distribution of data in typical inference scenarios. You want to make sure that the calibration dataset covers all the expected scenarios.
The calibration dataset shouldn't overlap with the training, validation or test datasets, in order to avoid a situation where the calibrated model only works well on these datasets.
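One simple way to build such a set is to draw a fixed-size random sample from a held-out image folder. This is only a sketch: the function name, the suffix list, and the idea of a single folder of held-out images are assumptions, not something prescribed by TensorRT.

```python
import random
from pathlib import Path


def pick_calibration_files(image_dir, count, seed=0,
                           suffixes=(".jpg", ".jpeg", ".png")):
    """Return up to `count` image paths sampled uniformly at random from image_dir."""
    files = sorted(
        p for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )
    if count >= len(files):
        return files
    # A fixed seed keeps the calibration set reproducible between runs.
    return random.Random(seed).sample(files, count)
```

Sampling across all sub-folders (for example one folder per class) helps the sample reflect the overall distribution rather than a single class.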
Try running the network in FP32. If it produces the correct result, it is possible that lower precision has insufficient dynamic range for the network.
You can also try setting manual dynamic ranges for each network tensor using the setDynamicRange API.
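As a rough sketch of the manual approach in the TensorRT Python API, the helper below walks a network and calls set_dynamic_range on every tensor that has an entry in a user-supplied table. The helper name and the `ranges` dictionary (tensor name to maximum absolute value) are assumptions for illustration; the range values themselves have to come from your own measurements.

```python
def apply_dynamic_ranges(network, ranges):
    """Set symmetric dynamic ranges on network tensors.

    network: a TensorRT INetworkDefinition (or any object with the same accessors).
    ranges:  dict mapping tensor name -> maximum absolute value (user supplied).
    Returns the sorted list of names in `ranges` that were not found in the network.
    """
    applied = set()
    # Collect network inputs plus every layer output.
    tensors = [network.get_input(i) for i in range(network.num_inputs)]
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        tensors.extend(layer.get_output(j) for j in range(layer.num_outputs))

    for tensor in tensors:
        if tensor is not None and tensor.name in ranges:
            bound = abs(ranges[tensor.name])
            tensor.set_dynamic_range(-bound, bound)
            applied.add(tensor.name)
    return sorted(set(ranges) - applied)
```

Checking the returned list is a cheap way to catch tensor names that changed after the parser or an optimization step renamed them.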
I use the training dataset for calibration and test accuracy on it. From my experimental results, the calibrated model doesn't work well even on the calibration dataset.
From my experimental results, you can see that the two FP32 models have the same accuracy as PyTorch.
So is it possible that lower precision has insufficient dynamic range for the seresnext50 model? How can I deal with it?
Some other questions:
1. Regardless of accuracy, why doesn't the INT8 model have much speedup compared to FP32 for seresnext50? My GPU is a P4.
2. What visualization tools are there for TensorRT debugging?
I turned on INFO-level messages and found there are 371 layers in seresnext50. If I set the first 150 layers to FP32, the accuracy goes up from 79% to 87%.
So now I'm wondering why the INT8 speed is not much improved.
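For anyone who wants to repeat the experiment of keeping the first layers in FP32, it can be sketched roughly like this with the TensorRT Python API. The helper name is made up, and this is not the exact script used for the numbers above.

```python
def pin_leading_layers(network, count, dtype):
    """Request `dtype` for the first `count` layers of a network.

    network: a TensorRT INetworkDefinition (or an object with the same accessors).
    dtype:   the precision to request, e.g. trt.float32 when using TensorRT.
    Returns the names of the layers that were pinned.
    """
    pinned = []
    for i in range(min(count, network.num_layers)):
        layer = network.get_layer(i)
        layer.precision = dtype              # precision used for the layer's computation
        for j in range(layer.num_outputs):
            layer.set_output_type(j, dtype)  # precision of each output tensor
        pinned.append(layer.name)
    return pinned
```

With a real network you would call something like pin_leading_layers(network, 150, trt.float32). As far as I know the builder only honors these requests when a strict-types flag is set on the builder config (STRICT_TYPES in older releases, OBEY_PRECISION_CONSTRAINTS in newer ones), and some layers may not accept the requested type, so check the build log.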
Hi,
Can you share your code/script and model so I can try to reproduce this or further debug?
Meanwhile, could you please try using the "trtexec" command to test the model?
"trtexec" is useful for benchmarking networks and makes it faster and easier to debug the issue.
Here are my script, model, test images, and calibration dataset (the calibration dataset is also the training dataset); you can test with them.
I mainly focus on speed and accuracy.