0 mAP during LPD fine-tuning with a custom dataset

• Hardware: NVIDIA’s TAO-configured AMI on AWS
• Network Type: Detectnet_v2
• TAO version: 3.21.08

platesNet_tfrecords_kitti_train.txt (327 Bytes)
platesNet_train_resnet18_kitti.txt (3.1 KB)
trainingLog.txt (486.0 KB)

The attached training log shows 0 mAP throughout the entire training run of 500 epochs.

Neither the tfrecords creation step nor the evaluation step (run on the model before fine-tuning) reports any error.

I’m using the usa_unpruned.tlt model from LPDNet:

!ngc registry model download-version nvidia/tao/lpdnet:unpruned_v1.0

You are training on only 297 images.
Can you try a lower batch size?

Morganh, thanks for replying. Last night, while testing different changes to the spec file, I reduced the batch size to 32 and it “suddenly” worked, so that is indeed the solution to this problem. Since you suggest it as the first action to take: is there a reason why X number of images requires Y batch size?

See Frequently Asked Questions — TAO Toolkit 3.0 documentation

Is there a dependency of batch size on the accuracy of the model? How should I choose the appropriate batch size for my training?

As a common practice, a small batch size or single GPU is preferred for a small dataset; while a large batch size or multiple GPUs is preferred for a large dataset.
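
To illustrate the rule of thumb above, here is a minimal Python sketch of how the batch size changes the number of weight updates per epoch for the 297-image dataset in this thread. The helper `steps_per_epoch` and the candidate batch sizes are hypothetical and not part of TAO. It also assumes the last partial batch is kept (rounding up), which may differ from what the toolkit actually does. It does not prove why training gave 0 mAP; it only shows how few updates a large batch leaves on a small dataset.

```python
import math


def steps_per_epoch(num_images: int, batch_size_per_gpu: int, num_gpus: int = 1) -> int:
    """Approximate number of weight updates in one pass over the dataset.

    Assumes the final partial batch is kept (hence the ceiling).
    """
    effective_batch = batch_size_per_gpu * num_gpus
    return math.ceil(num_images / effective_batch)


NUM_IMAGES = 297  # dataset size mentioned in this thread

# Hypothetical candidate batch sizes, for comparison only.
for batch_size in (4, 16, 32, 64, 128):
    steps = steps_per_epoch(NUM_IMAGES, batch_size)
    print(f"batch_size_per_gpu={batch_size:>3} -> {steps} updates per epoch")
```

With 297 images, a batch size of 32 on one GPU gives about 10 updates per epoch, while 128 gives only about 3. Adding GPUs multiplies the effective batch size and lowers the count further.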

Great! Thanks for your help!!