Question about validation loss, mAP and over/underfitting (Yolo v4)

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
• Network Type Yolo_v4 - Resnet 18
• TLT Version (latest)
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

As a first-timer training a Yolo v4 object detector (I’ve trained DetectNet before), I’m a bit confused about the train-loss, validation-loss and mAP metrics. Let me elaborate: during training the train-loss declines, as expected. After the 10th epoch it runs the first evaluation. The validation-loss is lower than the train-loss. After about 50 or 60 epochs the validation-loss flatlines (as expected), but it is still lower than the train-loss. Over the last 20 epochs the validation-loss starts to climb, which indicates overfitting.

In my experience, one should see the validation-loss climb when the model is overfitting. Normally overfitting occurs when train-loss is lower than validation-loss. This is not the case in my situation.
Datasets are very large and nicely distributed. No strange train/val ratio. Maybe it’s the regularization or optimizers?
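
The pattern described above (validation-loss flattening and then climbing) can be checked programmatically. What follows is only a minimal sketch, assuming the validation-loss from each evaluation has been collected into a list by hand; the function name `best_before_rise`, the `patience` parameter and the numbers are hypothetical and are not taken from TLT or from the actual training log.

```python
def best_before_rise(val_losses, patience=3):
    """Return the index of the lowest validation loss once `patience`
    later evaluations in a row have failed to improve on it.
    Returns None if the loss is still improving at the end of the list."""
    best_idx = 0
    misses = 0  # consecutive evaluations without a new minimum
    for i in range(1, len(val_losses)):
        if val_losses[i] < val_losses[best_idx]:
            best_idx = i
            misses = 0
        else:
            misses += 1
            if misses >= patience:
                return best_idx
    return None


# Illustrative values only: loss falls, flattens, then climbs.
example = [5.0, 4.0, 3.5, 3.4, 3.45, 3.6, 3.8]
print(best_before_rise(example, patience=3))
```

With one evaluation every 10 epochs, an index `k` returned here would correspond to epoch `10 * (k + 1)` under that assumption.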

Which leads me to my question: when selecting an epoch to use for inference or retraining, should I choose the epoch with the lowest validation-loss, or the one with the highest mAP?

Yes. Please use the best one.

As in the one with the highest mAP?

Yes.
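
To illustrate the choice, here is a small sketch of picking the checkpoint by highest mAP rather than by lowest validation-loss. It assumes the per-evaluation metrics have been copied by hand into a dictionary keyed by epoch; the dictionary layout, the helper name `best_epoch` and all values are made up for illustration and do not come from TLT output.

```python
# Hypothetical per-evaluation metrics keyed by epoch; illustrative values only.
metrics = {
    60: {"val_loss": 3.40, "mAP": 0.71},
    70: {"val_loss": 3.45, "mAP": 0.74},
    80: {"val_loss": 3.60, "mAP": 0.73},
}


def best_epoch(metrics, key, higher_is_better):
    """Return the epoch whose `key` metric is best."""
    pick = max if higher_is_better else min
    return pick(metrics, key=lambda epoch: metrics[epoch][key])


by_map = best_epoch(metrics, "mAP", higher_is_better=True)
by_loss = best_epoch(metrics, "val_loss", higher_is_better=False)
print("highest mAP at epoch", by_map)
print("lowest validation-loss at epoch", by_loss)
```

In this made-up example the two criteria disagree, which is the situation the question is about; following the answer above, the epoch with the highest mAP would be the one to keep.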
