Hi,
I tried to train YOLOv3 with a Darknet-53 pre-trained backbone on the VOC07+12 dataset, but the mAP (INTEGRATE) score is only about 57% after 120 epochs, which is a significant discrepancy compared with the benchmark (about 82% mAP). The loss plateaued at about 3 after 120 epochs. Does anyone have a similar issue or some clues about my problem? I’ve attached my config file below. Thanks!
To improve mAP, replacing the pretrained model can be an option. For example, use the TLT classification network to train a pretrained TLT model on the ImageNet dataset.
Thanks a lot for sharing! By the way, I would like to ask a few more details about this experiment and its training configuration in order to improve mine:
For the ‘regularizer’, what is the difference between L1 and L2 (I followed the official TLT guide, which recommends L1), and does that choice have an impact on accuracy?
For the ‘annealing’ point, my choice was a little later, like 0.7 (actually I’m not familiar with this training schedule). Are there any common rules for choosing this value?
How did you get those anchor sizes for YOLO? I tried the kmeans.py provided in the YOLO example of TLT, but I got different anchors than yours (PS: my dataset is VOC07+12 resized to 416x416).
Thank you!
Actually I did not run more experiments to fine-tune all the parameters. I just copied one of my old training specs for yolo_v3 and started training.
For 1), L1 training makes the model easier to prune. I think you can keep your L2 setting; it should not have a significant impact on accuracy. If you have time, you can run experiments with both L1 and L2.
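For context, the regularizer is just a small block inside the training_config section of the spec file, so switching between the two is a one-line change (the field names below follow my old yolo_v3 spec; please double-check them against your TLT version):

```
training_config {
  regularizer {
    type: L1      # switch to L2 to compare
    weight: 3e-5  # regularization weight, worth sweeping per experiment
  }
}
```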
For 2), for annealing, you can check the TLT user guide: https://pgambrill.gitlab-master-pages.nvidia.com/tlt-docs/text/creating_experiment_spec.html#specification-file-for-detectnet-v2 . There are no common rules. If it is set to 0.7, the annealing phase is shorter and the time spent at max_lr is longer; if set to 0.5, the annealing phase is longer and the time at max_lr is shorter. You can run experiments to check which is better for your case.
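To make that tradeoff concrete, here is a small sketch of how a soft-start annealing schedule behaves. This is my own illustration, not TLT's actual implementation; the parameter names (min_lr, max_lr, soft_start, annealing) just mirror the spec fields:

```python
def soft_start_annealing_lr(progress, min_lr=1e-6, max_lr=1e-4,
                            soft_start=0.1, annealing=0.7):
    """Sketch of a soft-start annealing learning-rate schedule.

    progress: fraction of training completed, in [0, 1].
    Ramps exponentially from min_lr to max_lr during the soft-start
    phase, holds max_lr, then decays back toward min_lr once
    `progress` passes the `annealing` point.
    """
    if progress < soft_start:
        # warm-up: interpolate from min_lr toward max_lr
        t = progress / soft_start
    elif progress < annealing:
        # plateau at max_lr
        t = 1.0
    else:
        # decay back toward min_lr over the remaining progress
        t = 1.0 - (progress - annealing) / (1.0 - annealing)
    # exponential interpolation between min_lr and max_lr
    return min_lr * (max_lr / min_lr) ** t
```

With annealing=0.7 the plateau lasts until 70% of training and the decay is squeezed into the last 30%; with annealing=0.5 the decay starts at the halfway point and is more gradual.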
For 3), I think this has more impact on accuracy. Yes, I used kmeans.py, but I ran it against the VOC07 training dataset along with the VOC2012 training dataset: I copied the two datasets into one folder, with all images resized to 416x416.
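As an illustration of what kmeans.py is doing, here is a toy k-means over box (width, height) pairs. It is a simplified stand-in, not the TLT script: anchor clustering often uses 1 - IoU as the distance metric, while this sketch uses plain squared Euclidean distance for brevity:

```python
def kmeans_anchors(boxes, k=9, iters=100):
    """Toy k-means over (width, height) pairs to pick YOLO anchor sizes.

    boxes: list of (w, h) tuples, already scaled to the training
    resolution (e.g. 416x416). Returns k anchor sizes sorted by width.
    """
    centers = list(boxes[:k])  # deterministic init: first k boxes
    for _ in range(iters):
        # assign each box to its nearest center
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            i = min(range(k),
                    key=lambda j: (w - centers[j][0]) ** 2
                                + (h - centers[j][1]) ** 2)
            clusters[i].append((w, h))
        # move each center to the mean of its cluster
        centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[j]
            for j, c in enumerate(clusters)
        ]
    return sorted(centers)
```

This also shows why we got different anchors: the clusters depend entirely on the box statistics fed in, so running it on VOC07 alone versus VOC07+12, or at a different input resolution, will shift the results.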
One more difference I want to highlight: I set a different validation dataset. I use the VOC07 test dataset.
Thank you for your detailed answer! I will try more experiments with different configs to see if I can get a better result. Besides, is there any chance you could share the training log of your experiment, please? I would like to compare how the loss varies.