Why are the results wrong when doing INT8 optimization with a low-rank approximation SSD?

Dear AastaLLL,

I have sent you a private message through the forum. Please check it.

Hi,

Sorry for the delay.

We have downloaded your data and are checking it internally.
We will update you with more information later.

Thanks.

Hi,

Sorry for the unclear explanation.

In comment #20, we indicated that we would prefer an MNIST classification model.
It looks like the model shared in comment #21 is still a detection use case with several plugin layers.

It is not easy for our developers to debug, since it involves a lot of custom code and usage.
Could you help prepare the data requested in comment #20?

Thanks.

Hi,

We have tried the package shared in comment #21.
After adding a calibrator for INT8 mode that inherits the shared calibration table, the result is similar to FP32.
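
For reference, the calibrator is essentially a cache-only wrapper. Below is a minimal sketch of that idea, not the exact code we ran (the class name and table path are illustrative, and it assumes a recent TensorRT release with the IInt8EntropyCalibrator2 interface):

```cpp
// Minimal sketch: a calibrator that only replays an existing calibration
// table, so the builder reuses the shared per-tensor scales instead of
// recalibrating from images.
#include <NvInfer.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

class CacheOnlyCalibrator : public nvinfer1::IInt8EntropyCalibrator2
{
public:
    explicit CacheOnlyCalibrator(const std::string& tablePath) : mTablePath(tablePath) {}

    int getBatchSize() const noexcept override { return 1; }

    // Returning false tells TensorRT there is no calibration data;
    // the builder then relies entirely on readCalibrationCache().
    bool getBatch(void*[], const char*[], int) noexcept override { return false; }

    const void* readCalibrationCache(std::size_t& length) noexcept override
    {
        std::ifstream file(mTablePath, std::ios::binary);
        mCache.assign(std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>());
        length = mCache.size();
        return mCache.empty() ? nullptr : mCache.data();
    }

    // Nothing to write back; the shared table is kept as-is.
    void writeCalibrationCache(const void*, std::size_t) noexcept override {}

private:
    std::string mTablePath;
    std::vector<char> mCache;
};
```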

FP32 mode:
0–1, 0.999835, 0.592052, 0.251454, 0.726451, 0.845161
0–1, 0.999792, 0.048442, 0.274386, 0.196504, 0.850645
0–1, 0.999286, 0.24038, 0.265426, 0.375126, 0.8482
0–1, 0.999102, 0.76901, 0.234462, 0.903232, 0.849441
0–1, 0.995542, 0.418688, 0.270036, 0.555362, 0.85595

INT8 mode:
0–1, 0.999491, 0.0494307, 0.275875, 0.195488, 0.847609
0–1, 0.999326, 0.588448, 0.251332, 0.724735, 0.84268
0–1, 0.998038, 0.240397, 0.272721, 0.375841, 0.84005
0–1, 0.99662, 0.769659, 0.246782, 0.90639, 0.84282
0–1, 0.988917, 0.417868, 0.27587, 0.555188, 0.851385

Could you help to recheck it?

Thanks.

Hi,

You print out the SSD classifier and the top-1 bbox score; can I interpret the output format as:
<top 1>, score, bbox.xmin, bbox.ymin, bbox.xmax, bbox.ymax?
However, 0.592052 is greater than 0.251454 in the first line of the FP32 result, so is there another printout format for your output? Would you like to provide it? There is also a big gap between the first FP32 result and the INT8 bbox (if it is printing the bbox), so we still need to understand the meaning of the numbers that follow 0.999835.

We will recheck our code and results. Which images did you pick for validation? We need to cross-check with them.

Thanks.

Hi,

Thanks for your feedback.

The data shown in comment #24 is the direct output of the given model.
As you said, there are some differences between FP32 and INT8.
Sorry for not noticing that.

We are checking this issue again.
We will update you with more information later.

Thanks.

Hi,

We compiled the code in comment #21, ran it with the ./data/1.jpg file, and got a different result:

FP32 mode:
0–1, 0.998724, 0.178847, 0.269576, 0.283985, 0.855274
0–1, 0.997735, 0.444632, 0.245118, 0.549333, 0.84125
0–1, 0.997612, 0.784388, 0.265487, 0.901109, 0.858277
0–1, 0.99717, 0.0339632, 0.267934, 0.15187, 0.838225
0–1, 0.989945, 0.576447, 0.247021, 0.679591, 0.850348
0–1, 0.978862, 0.310663, 0.245406, 0.416415, 0.824407
0–1, 0.548041, 0.935928, 0.00832144, 1.01391, 0.182722

INT8 mode:
0–1, 0.996268, 0.179394, 0.267203, 0.284639, 0.846807
0–1, 0.99345, 0.781464, 0.256652, 0.900316, 0.852235
0–1, 0.993077, 0.0339053, 0.271945, 0.15262, 0.845425
0–1, 0.988621, 0.447997, 0.263895, 0.548257, 0.842531
0–1, 0.948939, 0.306625, 0.258145, 0.413164, 0.846016
0–1, 0.948452, 0.575053, 0.257146, 0.678382, 0.849395
0–1, 0.687262, 0.936254, 0.00462355, 1.01454, 0.17509

It seems the output order is not retained across the two modes, but for every item that appears in FP32 mode we can find the corresponding item in INT8 mode.

We have updated our SSD plugin code included in comment #21, and we will recheck our original low-rank model. The attachments are the command lines and the full logs.

Thanks.
int8.log (15.1 KB)
fp32.log (14.5 KB)
test_int8.sh.txt (118 Bytes)
test.sh.txt (93 Bytes)

Hi,

Thanks for sharing this information with us.

It looks like this difference comes from non-maximum suppression.
Since you implement it with the plugin API, is there any possibility that it causes the different ranking order?
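
For illustration only, a typical NMS follows the pattern sketched below (this is not your plugin code); because the surviving boxes are emitted in descending score order, the small score shifts introduced by INT8 quantization are already enough to swap neighbouring entries even when the kept boxes are essentially the same:

```cpp
// Illustrative NMS sketch: sort by confidence, then greedily keep boxes
// that do not overlap an already-kept box beyond the IoU threshold.
#include <algorithm>
#include <vector>

struct Detection { float score, xmin, ymin, xmax, ymax; };

static float iou(const Detection& a, const Detection& b)
{
    const float ix = std::max(0.f, std::min(a.xmax, b.xmax) - std::max(a.xmin, b.xmin));
    const float iy = std::max(0.f, std::min(a.ymax, b.ymax) - std::max(a.ymin, b.ymin));
    const float inter = ix * iy;
    const float areaA = (a.xmax - a.xmin) * (a.ymax - a.ymin);
    const float areaB = (b.xmax - b.xmin) * (b.ymax - b.ymin);
    return inter / (areaA + areaB - inter);
}

std::vector<Detection> nms(std::vector<Detection> dets, float iouThresh)
{
    // The output follows this score ordering, so FP32 vs. INT8 scores that
    // differ only in the third decimal can already swap neighbouring rows.
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.score > b.score; });

    std::vector<Detection> kept;
    for (const auto& d : dets)
    {
        bool suppressed = false;
        for (const auto& k : kept)
            if (iou(d, k) > iouThresh) { suppressed = true; break; }
        if (!suppressed)
            kept.push_back(d);
    }
    return kept;
}
```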

Thanks.

Hi,

Today we retrained our original low-rank network on the original business dataset and ran inference with the updated TensorRT plugin code + INT8; the result is still wrong.
We are trying to scale and pad the dataset from comment #21 from 600x300 to 640x448, so the training process can share exactly the same network with different weights only, to confirm which part leads to the wrong result. Roughly, the preprocessing looks like the sketch below.
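
This is only a sketch of the preprocessing we are trying (the function name and the zero-padding choice are our own, not part of the shared package):

```cpp
// Scale a 600x300 image to fit inside 640x448, then pad the remaining
// bottom/right borders with zeros so every image shares the same input size.
#include <opencv2/opencv.hpp>

cv::Mat scaleAndPad(const cv::Mat& src, int dstW = 640, int dstH = 448)
{
    // Uniform scale that keeps the aspect ratio and fits the target size.
    const double scale = std::min(dstW / static_cast<double>(src.cols),
                                  dstH / static_cast<double>(src.rows));
    cv::Mat resized;
    cv::resize(src, resized, cv::Size(), scale, scale, cv::INTER_LINEAR);

    // Pad with a constant border to reach exactly 640x448.
    cv::Mat padded;
    cv::copyMakeBorder(resized, padded,
                       0, dstH - resized.rows,
                       0, dstW - resized.cols,
                       cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0));
    return padded;
}
```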

Thanks.

Hi,

We are trying to find where the error comes from.

There are lots of parameters in the plugin implementation.
Is tuning required for these parameters when INT8 is enabled?

Thanks.

Hi,

We also noticed that, and we are trying to use a public dataset to train the original SSD while keeping the plugin parameters the same.
We have not changed these parameters when using the other optimized, INT8-enabled SSD networks.

Thanks.

Thanks for your feedback.
Please also let us know the results.

Hi,

Over the past few days we trained and tuned the low-rank model with a [b]public dataset[/b], which we resampled to 3 classes (bike, dog, car), and the [b]SSD + Low-Rank + INT8 detection is FINE[/b].

Since the network is exactly the same and only the weight files differ (we run inference with one .prototxt but different datasets and .caffemodel files), the root cause of the error must come from something else.

Thanks.

Hi,

Although we still have no idea about the cause of the error, is your issue already fixed?
Thanks.

Hi,

We trained SSD + Low-Rank + INT8 on the selected VOC dataset (bike, dog, car), and it detects well in both FP32 and INT8, as comment #33 says. But when we train it on our own dataset with the same deploy_gie.prototxt, the detection result is FINE in FP32 but WRONG in INT8.
More importantly, we find that the INT8 detection result of the our-dataset edition (lowrank_int8_our) becomes correct when we change the calibration file (lowrank_int8_our/data/int8_calib.txt) to the one from the VOC-dataset edition (lowrank_int8_voc/data/int8_calib.txt).

Maybe something goes wrong when the calibration meets our model or dataset. We have provided the whole dataset, model, logs, and code via a private message on the forum; we hope it is useful for you to locate the problem and help us fix it.

Thanks very much.

Hi,

We are sorry for keeping you waiting.
Based on comment #33, we thought the blocker in your project was resolved, so we stopped tracking this issue.

Before further investigation, we want to summarize the experiments we have so far:

|       |  Training  |  Calibration  |  Result (INT8)  |
---------------------------------------------------------
| EXP-1 |    VOC     |      VOC      |       OK        |
| EXP-2 |    OWN     |      OWN      |       NO        |
| EXP-3 |    OWN     |      VOC      |       OK        |

If this table is correct, your issue should come from the INT8 calibration with your own database.

If you agree with this, could you share the number of images and the calibration log for both your own and the VOC database?
If this information is already included in the private message, please just skip it.

Thanks.

Hi,

We just cannot duplicate the failure from comment #33. So we finally prepared our own dataset with the remaining 1000 images, and we generate the calibration by randomly selecting 100 images from them with a batch size of 8, roughly as sketched below.
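
This is only a minimal sketch of how the calibration set is built (the function name and structure are illustrative, not our exact script):

```cpp
// Randomly pick 100 of the ~1000 remaining images and split them into
// batches of 8; each batch is then preprocessed and handed to the
// calibrator's getBatch() during INT8 calibration.
#include <algorithm>
#include <random>
#include <string>
#include <vector>

std::vector<std::vector<std::string>> makeCalibrationBatches(
    std::vector<std::string> allImages,   // ~1000 image paths
    std::size_t numSamples = 100,
    std::size_t batchSize = 8)
{
    std::mt19937 rng{std::random_device{}()};
    std::shuffle(allImages.begin(), allImages.end(), rng);
    allImages.resize(std::min(numSamples, allImages.size()));

    std::vector<std::vector<std::string>> batches;
    for (std::size_t i = 0; i < allImages.size(); i += batchSize)
    {
        const std::size_t end = std::min(i + batchSize, allImages.size());
        batches.emplace_back(allImages.begin() + i, allImages.begin() + end);
    }
    return batches;
}
```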

Thanks.

Hi @AastaLLL,
The table is correct, and we have provided the image data and the calibration file in the private-message share. Just like haifengli said, we generate the calibration by randomly selecting 100 images with a batch size of 8. We hope this information is useful for you to duplicate the experiments.
Thanks.

Hi,
We found that the wrong result of EXP-2 in the table of comment #33 is a mistake. We used a wrong calibration file, which was not generated by this experiment. After changing it to the right calibration file, it works fine. So the wrong result we got last time may also have been caused by a wrong calibration file; we will run it again to recheck. We may have to trouble you again if the failure is duplicated.

Thanks.

Hi,

We are checking the files you shared internally.
We will update you with more information later.

Thanks.