NVIDIA TAO MaskRCNN training problem: training completes, but the evaluation metrics are all 0 and inference does not produce correct masks

I spent a long time figuring out how to run NVIDIA TAO MaskRCNN training (nvidia-tao/maskrcnn.ipynb at main · NVIDIA-AI-IOT/nvidia-tao · GitHub).

The training finally completes and “[INFO] Training finished successfully” is displayed, but the evaluation metrics are all 0 and inference does not produce correct masks.

• Hardware : Running TAO Toolkit on Google Colab
• Network Type : Mask_rcnn
• Training spec file :
maskrcnn_train_resnet50.txt (2.1 KB)

• How to reproduce the issue :
train_log (330.1 KB)
Generate_tfrecords_log (1.4 KB)
enviroment_setting_log (988 Bytes)

Please help! Many thanks in advance!

Make sure `num_classes` is correct.
Refer to MaskRCNN - NVIDIA Docs

The number of classes. If there are N categories in the annotation, num_classes should be N+1 (background class)
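The N+1 rule quoted above can be checked against the annotation file before training. The following is a minimal sketch, assuming the annotations are in COCO-style JSON with a top-level "categories" list; the helper name `expected_num_classes` and the example category names are hypothetical and are not part of TAO.

```python
def expected_num_classes(coco_annotations: dict) -> int:
    """Return N + 1, where N is the number of distinct category ids."""
    # Collect unique ids so that duplicated category entries are not counted twice.
    category_ids = {cat["id"] for cat in coco_annotations.get("categories", [])}
    # Add one for the background class.
    return len(category_ids) + 1


# Hypothetical annotation content with 3 categories, as described in this thread.
example = {
    "categories": [
        {"id": 1, "name": "class_a"},
        {"id": 2, "name": "class_b"},
        {"id": 3, "name": "class_c"},
    ]
}

print(expected_num_classes(example))  # 4
```

For a real dataset, the dictionary would come from loading the annotation JSON with `json.load`, and the result would be compared with the num_classes value in the training spec.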

I am sure `num_classes` is correct.

(I have 3 classes in total, and I set num_classes = 4.)

Please increase the parameter below.
total_steps: 25000

You only set it to 10.

I set it to 10 because I am using only 20 training images to get the demo running first; afterwards, I will increase the number of training images.

So do you think I still need to increase total_steps to such a large value?

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

Yes, please change it for the new training.
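To put the two total_steps values in perspective, the number of passes over the dataset can be estimated from the step count. This is only a rough sketch: the batch size of 2 and the single GPU are assumptions, since the actual values from the attached spec file are not shown in this thread, and `epochs_covered` is a hypothetical helper, not a TAO function.

```python
def epochs_covered(total_steps: int, batch_size: int, num_gpus: int, num_images: int) -> float:
    """Estimate how many times training passes over the whole dataset."""
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    # Each step consumes batch_size images on every GPU.
    images_seen = total_steps * batch_size * num_gpus
    return images_seen / num_images


# Assumed values: batch size 2 on 1 GPU (not taken from the thread), 20 training images.
print(epochs_covered(10, 2, 1, 20))     # 1.0
print(epochs_covered(25000, 2, 1, 20))  # 2500.0
```

Under these assumptions, total_steps = 10 amounts to about one pass over the 20 images, which is generally far too little for the model to produce meaningful masks or non-zero metrics. A value between 10 and 25000 may be enough for a quick trial, but that would need to be confirmed by experiment.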
