I have run numerous experiments using a 2 class custom dataset and have fine-tuned the maximum learning rate, soft start and coverage threshold. In all experiments the training achieves the best results in 20 epochs at most and extending the training beyond this causes a lower performance. I have also found that starting annealing as late as possible (0.95) keeps the performance higher throughout training.
This seems to run contrary to examples in the documentation, where training experiments of 60 - 100 epochs are mentioned.
Do you have any suggestions as to how I can extend training and improve the mAP. I currently have a best result of 92.6% and this was achieved on the 20th (final) epoch. An earlier experiment achieved around 70% mAP at 30 epochs and by 80 epochs it was down to around 55.6%. \this is what prompted me to focus in on the earlier epochs.
I had to rerun the training, but it gave a pretty similar result (91.3% at epoch 19 this time vs. 92.6% at epoch 20 last time). detectnet_v2_resnet50-20 epoch.pdf (215.0 KB)
I think you are running the experiments which is similar to your old topic Detectnet_v2(resnet50) low accuracy on 2 class dataset. For your training images and labels, I remember that you label the whole images as damage or healthy. If that is the case, actually you can train with classification network instead of detection network.
If you decide the use detection network, I suggest you to label to the “damage” class where the area is really damage. And label to the “healthy” class where is really healthy.
Okay, thanks for that, that means ‘deleting’ half the database.
I will run more experiments.
I can confirm that the first experiment produced a better mAP for damage. The main aim will be for performance to improve over more epochs.
I am following up on this to say that through a combination of the larger image size of 1216 x 480 pixels and slightly reducing the soft start of the training_config from the example given in the documentation, a 50 epoch experiment yielded results of 95.1 mAP at 14 epoch, 93.3 at 30 epoch and 93.1 at 43 epoch.
Experiments of 60 epochs have so far yielded nothing better than 92.4, though this was achieved at 50/60 epoch. I guess this is a solution, just not quite as definitive as one might have hoped for.