INT8 YOLO model conversion led to accuracy drop in DeepStream

Hi there,
As stated here, I was able to calibrate and generate an INT8 engine in the YOLO example. However, the accuracy (mAP) of the INT8 model dropped by about 7-15% compared with the FP32 model. Is this normal? How can I improve it?

My setup is the following:
Jetson Xavier
DeepStream 5.0
JetPack 4.4
TensorRT 7.1.3
CUDA 10.2

Hi,

It's possible to see some accuracy drop when running inference in INT8 mode.
The amount of drop depends on the calibration and on the model itself.

However, a 7-15% drop seems too large.
Would you mind sharing the original model file (e.g. .onnx, .pb, or .caffemodel) with us, as well as the data and source code you used for generating the calibration cache?

Thanks.

Hi,

Thanks for the swift response. The original model files I used were a Darknet weights file and a cfg file. For calibration, I first selected 200 random images from the training set as the calibration dataset, and then tried using the entire training set instead. The latter option seemed to give better accuracy.
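
For context, a calibrator for this kind of workflow typically follows TensorRT's IInt8EntropyCalibrator2 pattern. Below is a simplified sketch, not my exact code; the class name, image directory, input size, batch size, and preprocessing are placeholders and must match the real pipeline:

import os
import random

import cv2
import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt


class YoloEntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed calibration images to TensorRT and caches the resulting scales."""

    def __init__(self, image_dir, cache_file, input_shape=(3, 608, 608), batch_size=8):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.cache_file = cache_file
        self.batch_size = batch_size
        self.input_shape = input_shape
        self.images = [os.path.join(image_dir, f) for f in os.listdir(image_dir)]
        random.shuffle(self.images)
        self.index = 0
        # One device buffer large enough for a full batch of float32 inputs.
        self.device_input = cuda.mem_alloc(
            batch_size * int(np.prod(input_shape)) * np.dtype(np.float32).itemsize)

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        if self.index + self.batch_size > len(self.images):
            return None  # no more data: calibration stops here
        batch = []
        for path in self.images[self.index:self.index + self.batch_size]:
            img = cv2.imread(path)
            img = cv2.resize(img, (self.input_shape[2], self.input_shape[1]))
            # BGR -> RGB, HWC -> CHW, scale to [0, 1]; must match the inference preprocessing.
            img = img[:, :, ::-1].transpose(2, 0, 1).astype(np.float32) / 255.0
            batch.append(img)
        self.index += self.batch_size
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(batch, dtype=np.float32))
        return [int(self.device_input)]

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)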

I uploaded model files and a small subset of the training set.

Thanks in advance.

Hi,

Thanks for sharing the data with us.

Have you tried running the model with the TensorRT API directly?
If not, would you mind giving it a try?

This will help us determine whether the issue comes from DeepStream or from TensorRT.
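
If it helps, a quick way to do this is to build an INT8 engine with trtexec using the same calibration cache and evaluate it on the same test set, for example (file names are placeholders, assuming an ONNX export of the model is available):

trtexec --onnx=yolov3.onnx --int8 --calib=calib_cache.bin --saveEngine=yolov3_int8.engine

Then compare the mAP from this engine with the mAP from the DeepStream pipeline.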

Thanks.

I tried. The two methods give very similar results, with a difference in mAP of less than 0.5%.

Hi,

Thanks for sharing the subset with us.
Could you also share the source code you used for generating the calibration file?

Thanks.

Here is the TensorRT API source, and here is the DeepStream source.

Hi,

We are checking this issue internally.
We will share an update with you later.

Thanks.

Thanks. Looking forward to your update.

Hi,

We checked the calibration cache shared in this comment.

In general, TensorRT merges/combines several layers together for acceleration (e.g. conv + scale + activation).
However, the layers in your cache file were calibrated without this merging.
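
One quick way to check this is to list the tensor names recorded in the cache and compare them against the layers of the engine TensorRT actually builds. The cache is a plain-text file, so a small sketch like this is enough (the filename is a placeholder):

with open("calib.bin") as f:            # path to the calibration cache
    print(next(f).strip())              # header line, e.g. TRT-7103-EntropyCalibration2
    for line in f:
        print(line.split(":")[0])       # name of the tensor each scale belongs to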

We are not sure whether this is what causes the unexpected accuracy drop.
Would you mind trying the calibration tool shared in the GitHub repository below again:

We have verified that the cache files in that GitHub repository output the detections correctly.

Thanks.

Thanks for your feedback!

I’ll give it a try.

After using the cache file generated from the recommended repo in the DeepStream yolo-app, the inference speed dropped significantly, to about 8 fps.

It seems that TensorRT did not know how to perform INT8 quantization from the given calibration cache, so it ended up building an FP32 or FP16 engine.

I might have misunderstood some of your statements. When you said the cache files in the GitHub repository can output the detections correctly, did you test that in the DeepStream yolo-app?

Thanks again for your help.

Hi,

Could you share your detailed procedure with us?

INT8 is a mode that the user sets in the configuration.
So the model will run inference in INT8 mode as long as the configuration and the cache are provided correctly.

For example, in config_infer_primary_yoloV3_tiny.txt:

[property]
...
int8-calib-file=[the cache file generated above]
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
...

Thanks.

I followed demo #5 to create an ONNX file, then followed demo #6 to calibrate and generate a calibration cache. I then used that cache as the int8-calib-file in the DeepStream yolo-app.

Hi,

Could you share the .cfg, .weights, .onnx, and the corresponding cache file with us?

In addition, we tested the default YOLOv3 Tiny model cache and got the expected output result.

Please check whether this also works on your side.
Thanks.

Here are the files.

When I used the caches generated by tensorrt-demo within that repo, they all worked fine. But when I moved the cache into DeepStream, I got the following:

ERROR: [TRT]: Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.
ERROR: [TRT]: Builder failed while configuring INT8 mode.
Building engine failed!

I also tried the yolov3-tiny cache as you suggested, and the same thing happened: it works only within that repo and cannot be transferred to DeepStream. The error is the same as above.

Hi,

Thanks for sharing the model and cache.

We can reproduce this issue internally and are checking it.
We will get back to you later.

Hi,

We changed the layer name 000_net to data in calib_yolov3-int8-608.bin.

TRT-7103-EntropyCalibration2
data: 3c010a14
...
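
Since the cache is a plain-text file, the rename is a one-line edit, for example:

sed -i 's/^000_net:/data:/' calib_yolov3-int8-608.bin

(Here data is the input tensor name used on the DeepStream side, while the ONNX exported from the demo repo names its input 000_net.)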

With this change, DeepStream can run the model with the cache successfully.
Could you also give it a try?

Thanks.

Of course. I’ll work on it as soon as I can.

Thanks.

Hi,
It did work. However, the accuracy only increased by about 1.2 percent, which means the INT8 quantization still causes about a 6 percent accuracy drop. Is there any other way to improve this?