Training Mask R-CNN from scratch using TAO

• Hardware (A30)
• Network Type (Mask_rcnn)
• TLT Version (/tao-toolkit-tf:v3.22.05-tf1.15.5-py3)
• Training spec file(If have, please share here)
maskrcnn_train_resnet18.txt (2.1 KB)
I have two questions.
(1) If I want to train Mask R-CNN from scratch, what do I need to change in the config file?
(2) My images are fisheye images, and I would like to train Mask R-CNN on them.
If I train using a pretrained model, which config parameters should I pay attention to for good accuracy?

  1. To train from scratch, you can set checkpoint to “” (an empty string).
  2. For the pretrained case, you can refer to Poor metric results after retraining maskrcnn using TLT notebook - #17 by Morganh.
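
As a rough illustration of the first suggestion, the sketch below blanks the checkpoint value in a spec file's text. It assumes the spec uses `key: "value"` lines; the helper name `clear_checkpoint` and the example path are hypothetical and not taken from the attached spec file.

```python
import re


def clear_checkpoint(spec_text: str) -> str:
    """Replace the quoted value of every `checkpoint:` line with an empty string."""
    # Capture the key and separator, then swap whatever is quoted for "".
    pattern = re.compile(r'^(\s*checkpoint\s*:\s*)"[^"]*"', flags=re.MULTILINE)
    return pattern.sub(r'\1""', spec_text)


# Hypothetical fragment for illustration only; the path is made up.
example_spec = 'checkpoint: "/path/to/pretrained_model.hdf5"\nfreeze_bn: True\n'
updated_spec = clear_checkpoint(example_spec)
print(updated_spec)
```

Editing the line by hand in the spec file achieves the same thing; the helper is only convenient when several spec files need the same change.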

Sure, thanks. I’ll train from scratch first.

I am using the following config file. Which parameters should I change?
The accuracy after training is quite poor. I am training from scratch on fisheye images.

maskrcnn_train_resnet18.txt (2.2 KB)

I used the same dataset to train mmdetection’s Mask R-CNN from scratch, and the detection accuracy there is quite good.

DLL 2023-01-04 11:17:25.504965 - Iteration: 4830 Validation Iteration: 4830  AP : 0.00856037437915802
DLL 2023-01-04 11:17:25.505199 - Iteration: 4830 Validation Iteration: 4830  AP50 : 0.020551249384880066
DLL 2023-01-04 11:17:25.505258 - Iteration: 4830 Validation Iteration: 4830  AP75 : 0.00629538856446743
DLL 2023-01-04 11:17:25.505304 - Iteration: 4830 Validation Iteration: 4830  APs : 0.0
DLL 2023-01-04 11:17:25.505350 - Iteration: 4830 Validation Iteration: 4830  APm : 0.0002463551063556224
DLL 2023-01-04 11:17:25.505398 - Iteration: 4830 Validation Iteration: 4830  APl : 0.009035307914018631
DLL 2023-01-04 11:17:25.505441 - Iteration: 4830 Validation Iteration: 4830  ARmax1 : 0.04031337797641754
DLL 2023-01-04 11:17:25.505482 - Iteration: 4830 Validation Iteration: 4830  ARmax10 : 0.09632998704910278
DLL 2023-01-04 11:17:25.505522 - Iteration: 4830 Validation Iteration: 4830  ARmax100 : 0.10416806489229202
DLL 2023-01-04 11:17:25.505565 - Iteration: 4830 Validation Iteration: 4830  ARs : 0.0
DLL 2023-01-04 11:17:25.505605 - Iteration: 4830 Validation Iteration: 4830  ARm : 0.0010416667209938169
DLL 2023-01-04 11:17:25.505664 - Iteration: 4830 Validation Iteration: 4830  ARl : 0.11147881299257278
DLL 2023-01-04 11:17:25.505706 - Iteration: 4830 Validation Iteration: 4830  mask_AP : 0.0009729214361868799
DLL 2023-01-04 11:17:25.505747 - Iteration: 4830 Validation Iteration: 4830  mask_AP50 : 0.003719929838553071
DLL 2023-01-04 11:17:25.505787 - Iteration: 4830 Validation Iteration: 4830  mask_AP75 : 9.244621651305351e-06
DLL 2023-01-04 11:17:25.505832 - Iteration: 4830 Validation Iteration: 4830  mask_APs : 0.0
DLL 2023-01-04 11:17:25.505879 - Iteration: 4830 Validation Iteration: 4830  mask_APm : 0.0001544429687783122
DLL 2023-01-04 11:17:25.505924 - Iteration: 4830 Validation Iteration: 4830  mask_APl : 0.0009968809317797422
DLL 2023-01-04 11:17:25.505964 - Iteration: 4830 Validation Iteration: 4830  mask_ARmax1 : 0.0026150394696742296
DLL 2023-01-04 11:17:25.506003 - Iteration: 4830 Validation Iteration: 4830  mask_ARmax10 : 0.006770129315555096
DLL 2023-01-04 11:17:25.506047 - Iteration: 4830 Validation Iteration: 4830  mask_ARmax100 : 0.007635500282049179
DLL 2023-01-04 11:17:25.506087 - Iteration: 4830 Validation Iteration: 4830  mask_ARs : 0.0
DLL 2023-01-04 11:17:25.506124 - Iteration: 4830 Validation Iteration: 4830  mask_ARm : 0.00244633830152452
DLL 2023-01-04 11:17:25.506164 - Iteration: 4830 Validation Iteration: 4830  mask_ARl : 0.007796799764037132
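
To compare runs without reading the log by eye, the validation lines can be collected into a dictionary. This is a minimal sketch that assumes every metric line follows the exact layout shown above; the function name `parse_validation_metrics` is hypothetical and is not part of TAO.

```python
import re

# Matches lines such as:
# DLL <date> <time> - Iteration: 4830 Validation Iteration: 4830  AP : 0.0085...
LINE_RE = re.compile(
    r"^DLL \S+ \S+ - Iteration: (\d+)\s+Validation Iteration: \d+\s+(\w+) : (\S+)$"
)


def parse_validation_metrics(log_text):
    """Return {iteration: {metric_name: value}} for every matching log line."""
    metrics = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            iteration = int(match.group(1))
            metrics.setdefault(iteration, {})[match.group(2)] = float(match.group(3))
    return metrics


# Three lines copied from the log above.
sample_log = """\
DLL 2023-01-04 11:17:25.504965 - Iteration: 4830 Validation Iteration: 4830  AP : 0.00856037437915802
DLL 2023-01-04 11:17:25.505706 - Iteration: 4830 Validation Iteration: 4830  mask_AP : 0.0009729214361868799
DLL 2023-01-04 11:17:25.505787 - Iteration: 4830 Validation Iteration: 4830  mask_AP75 : 9.244621651305351e-06
"""
metrics = parse_validation_metrics(sample_log)
print(metrics[4830]["AP"], metrics[4830]["mask_AP"])
```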

You can check the log mentioned in Poor metric results after retraining maskrcnn using TLT notebook - #20 by Morganh. That run was trained on the COCO dataset with a pretrained model.

Yes, that log is much better than mine. Which parameters should I change? I am training from scratch.

The corresponding part of the spec file is in Poor metric results after retraining maskrcnn using TLT notebook - #17 by Morganh.

I am training from scratch, so I shouldn’t freeze any layers or batch norm, right?
That is, these two:

#freeze_bn: True
#freeze_blocks: "[0,1]"

Yes, that is right.
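
The same change can be applied to a spec file's text programmatically. The following is only a sketch: it assumes the two settings each sit on their own line and that `#` starts a comment, as in the snippet above. The helper name `comment_out_freeze_settings` is hypothetical.

```python
import re

# The two settings discussed above.
FREEZE_KEYS = ("freeze_bn", "freeze_blocks")


def comment_out_freeze_settings(spec_text):
    """Prefix active freeze_bn / freeze_blocks lines with '#', keeping indentation."""
    out = []
    for line in spec_text.splitlines():
        stripped = line.lstrip()
        if any(re.match(rf"{key}\s*:", stripped) for key in FREEZE_KEYS):
            indent = line[: len(line) - len(stripped)]
            out.append(f"{indent}#{stripped}")
        else:
            # Lines that are already commented out, or unrelated, pass through unchanged.
            out.append(line)
    return "\n".join(out)


example = 'freeze_bn: True\nfreeze_blocks: "[0,1]"\ncheckpoint: ""'
result = comment_out_freeze_settings(example)
print(result)
```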

In addition, I suggest running two comparable experiments:

  • Training from scratch
  • Training with a pretrained model

My results are now getting better when training from scratch.
I’ll describe what I did once training is complete.

The attached file is the config file I used to train from scratch on fisheye images.
It achieved good accuracy in training.
maskrcnn_train_resnet50.txt (2.4 KB)

