Question about validation loss, AP and over-/underfitting (YOLOv4)

KGerry · February 18, 2022, 12:20pm

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
• Network Type Yolo_v4 - Resnet 18
• TLT Version (latest)
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

As a first-timer training a YOLOv4 object detector (I’ve trained DetectNet before), I’m a bit confused about the train loss, validation loss and (m)AP metrics. Let me elaborate: during training the train loss declines, as expected. After the 10th epoch it runs the first evaluation. The validation loss is lower than the train loss. After about 50 or 60 epochs, the validation loss flatlines (as expected) but is still lower than the train loss. Over the last 20 epochs the validation loss starts to climb, which indicates overfitting.

In my experience, one should see the validation loss climb when the model is overfitting. Normally overfitting occurs when the train loss is lower than the validation loss. This is not the case in my situation.
The datasets are very large and nicely distributed, with no unusual train/val ratio. Maybe it’s the regularization or the optimizer?
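
To make the pattern concrete (validation loss flattening, then climbing again), here is a minimal sketch that finds the evaluation with the lowest validation loss and counts how many evaluations have passed since. It assumes the losses were copied by hand from the training log into a list of (epoch, val_loss) pairs. The function name `validation_loss_trend` and all numbers are hypothetical and only illustrate the shape of the curve; they are not real results.

```python
def validation_loss_trend(history):
    """Locate the lowest validation loss in a run.

    history: list of (epoch, val_loss) pairs in epoch order.
    Returns (best_epoch, evals_since_best). A large evals_since_best
    means the loss has not improved for a while, which is the usual
    sign of overfitting.
    """
    if not history:
        raise ValueError("history is empty")
    # Position of the smallest validation loss (first one on ties).
    best_pos = min(range(len(history)), key=lambda i: history[i][1])
    best_epoch = history[best_pos][0]
    evals_since_best = len(history) - 1 - best_pos
    return best_epoch, evals_since_best


# Hypothetical values, one evaluation every 10 epochs.
history = [
    (10, 14.2), (20, 11.8), (30, 10.1), (40, 9.4),
    (50, 9.0), (60, 8.9), (70, 9.1), (80, 9.5),
]
best_epoch, evals_since_best = validation_loss_trend(history)
print(f"lowest validation loss at epoch {best_epoch}, "
      f"{evals_since_best} evaluations ago")
```

This only looks at the validation curve itself, so it works regardless of whether the train loss sits above or below it.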

Which leads me to my question: when selecting an epoch to use for inference or retraining, should I choose the epoch with the lowest validation loss, or the one with the highest (m)AP?

Morganh · February 18, 2022, 3:20pm

Yes. Please use the best one.

KGerry · February 18, 2022, 3:40pm

As in the one with the highest (m)AP?

Morganh · February 18, 2022, 3:41pm

Yes.
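
A minimal sketch of that selection, assuming the per-evaluation validation loss and mAP were copied by hand from the training log into a dictionary. The helper `pick_best_epoch` and all numbers are hypothetical and for illustration only; this is not a TLT API.

```python
def pick_best_epoch(metrics):
    """Return (epoch_with_highest_map, epoch_with_lowest_val_loss).

    metrics: dict mapping epoch -> {"val_loss": float, "map": float}.
    """
    if not metrics:
        raise ValueError("metrics is empty")
    by_map = max(metrics, key=lambda epoch: metrics[epoch]["map"])
    by_loss = min(metrics, key=lambda epoch: metrics[epoch]["val_loss"])
    return by_map, by_loss


# Hypothetical values; the two criteria need not agree.
metrics = {
    50: {"val_loss": 9.0, "map": 0.71},
    60: {"val_loss": 8.9, "map": 0.73},
    70: {"val_loss": 9.1, "map": 0.75},
    80: {"val_loss": 9.5, "map": 0.74},
}
by_map, by_loss = pick_best_epoch(metrics)
print(f"highest mAP: epoch {by_map}; lowest validation loss: epoch {by_loss}")
```

In this made-up example the two criteria pick different epochs (70 by mAP, 60 by validation loss); following the answer above, the checkpoint from the highest-mAP epoch would be the one to use.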
