Accuracy and mIoU of 1.0 when validating Mask2Former

Hardware: RTX3080Ti
Network: Mask2Former
Docker image: nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt

Spec file for training and validation:
exp_mask2former.txt (2.1 KB)

Issue:
During validation of Mask2Former, the accuracy and mIoU metrics are always exactly 1.0. This is clearly incorrect; the values should be lower. The issue occurs when validating on COCO panoptic as well as COCO instance annotations.

Troubleshooting:
Looking at the source code of the TAO PyTorch backend, it appears that the dataset classes (tao_pytorch_backend/nvidia_tao_pytorch/cv/mask2former/dataloader/datasets.py at main · NVIDIA/tao_pytorch_backend · GitHub) used for COCO always convert the segmentations to a semantic segmentation map.

Also, the predicted segmentation map passed to the evaluation metrics in the validation_step() method of the PyTorch Lightning model (tao_pytorch_backend/nvidia_tao_pytorch/cv/mask2former/model/pl_model.py at main · NVIDIA/tao_pytorch_backend · GitHub) always seems to be all zeros.
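If both observations hold, a single-class setup could degenerate: with num_classes set to 1, every pixel of the ground-truth semantic map and of an all-zero prediction carries the same label, so a per-class IoU averaged over the classes present is trivially perfect. A minimal sketch of that hypothesis (miou here is an illustrative reimplementation, not the actual TAO metric code):

```python
import numpy as np

def miou(pred, gt, num_classes):
    # Mean IoU over classes that appear in either map; classes with an
    # empty union are skipped, mirroring common mIoU implementations.
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0

gt = np.zeros((4, 4), dtype=int)      # single-class GT: every pixel is label 0
pred = np.zeros((4, 4), dtype=int)    # all-zero prediction, as observed above

print(miou(pred, gt, num_classes=1))  # 1.0 — the metric is trivially perfect

gt2 = np.zeros((4, 4), dtype=int)
gt2[:, 2:] = 1                        # a second class occupies half the image
print(miou(pred, gt2, num_classes=2)) # 0.25 — the mismatch is now visible
```

This would also be consistent with the two-class workaround described later in the thread: once a second label value appears in the ground truth, the all-zero prediction no longer matches it everywhere.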

Is there a way to fix the evaluation for the Mask2Former model for instance segmentation?

Can you run successfully with the default notebook/dataset?
See tao_tutorials/notebooks/tao_launcher_starter_kit/mask2former/mask2former.ipynb at main · NVIDIA/tao_tutorials · GitHub
and
tao_tutorials/notebooks/tao_launcher_starter_kit/mask2former/specs/spec.yaml at main · NVIDIA/tao_tutorials · GitHub.

Please note that there are 2 kinds of notebooks as well. tao_tutorials/notebooks/tao_launcher_starter_kit/mask2former at main · NVIDIA/tao_tutorials · GitHub.

Thank you for your reply!

I ran the instance segmentation tutorial notebook (mask2former_inst.ipynb). To speed up the process, I changed the batch size and number of workers for training and trained on the validation set. See the spec file here:
spec_inst.txt (1.6 KB)

From this I got the following metrics:

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃        Test metric        ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│          all_acc          │    0.3793853521347046     │
│           mIoU            │    0.01788681373000145    │
│         val_loss          │    62.049774169921875     │
└───────────────────────────┴───────────────────────────┘

So it seems like it is working correctly.

I adjusted the tutorial to use my own custom coco dataset, but again I got an accuracy and mIoU of 1.0 as seen below:

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃        Test metric        ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│          all_acc          │            1.0            │
│           mIoU            │            1.0            │
│         val_loss          │    59.251399993896484     │
└───────────────────────────┴───────────────────────────┘

The spec file used:
spec_inst_apples.txt (1.6 KB)

I then tested some other custom COCO datasets available online. From this it seems that the problem only occurs when the number of classes is 1. I adjusted my custom apples dataset so that it had two classes, with the annotations split roughly evenly between them. I ran the same training with only the number of classes adjusted:
spec_inst_apples2.txt (1.6 KB)

This produced an accuracy and mIoU that are not 1.0:

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃        Test metric        ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│          all_acc          │    0.6408777832984924     │
│           mIoU            │    0.3204388916492462     │
│         val_loss          │     70.84131622314453     │
└───────────────────────────┴───────────────────────────┘

Please follow the default notebook to train and run inference to confirm it is working. The number of training epochs is set to 50 by default. Your setting (training for only 1 epoch) is not enough.

Hello Morgan,

Thank you for your reply.

I do not care so much about how high the accuracy and mIoU of the model produced by the tutorial notebook are. I want to train a model on a custom dataset, but it seems that the evaluation script does not yield correct results when num_classes is set to 1 in the config file: the accuracy and mIoU are always 1.0. As per my previous post, the default notebook does yield correct (although not high) accuracy and mIoU. After changing my custom dataset to two classes, the validation does work correctly; however, this is not desired.

I am seeking to validate my custom trained model, which only has to predict a single class, instead of validating the standard model produced by the default notebook. Can you help me with this?

Can you increase the number of training epochs to check if it works? I am afraid the training has not converged yet.
I will also check further whether only 1 class is supported.

It can support running with only 1 class.
Please ensure that "the category ids and annotation ids must be greater than 0" (mask2former - NVIDIA Docs). Thanks.
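For reference, a quick way to verify that requirement on a dataset's annotation file. This is a hedged sketch: check_coco_ids and the sample dict are illustrative helpers, not part of TAO; point the check at your own COCO json.

```python
import json
import os
import tempfile

def check_coco_ids(path):
    """Return the category ids and annotation ids that violate the
    documented requirement of being greater than 0."""
    with open(path) as f:
        coco = json.load(f)
    bad_cats = [c["id"] for c in coco.get("categories", []) if c["id"] <= 0]
    bad_anns = [a["id"] for a in coco.get("annotations", []) if a["id"] <= 0]
    return bad_cats, bad_anns

# Demonstrate on a tiny in-memory sample written to a temp file; ids start
# at 1, so both lists of violations come back empty.
sample = {
    "categories": [{"id": 1, "name": "apple"}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1}],
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(sample, f)
    sample_path = f.name

bad_cats, bad_anns = check_coco_ids(sample_path)
os.remove(sample_path)
print("bad category ids:", bad_cats, "bad annotation ids:", bad_anns)
```

A dataset exported with 0-based category ids (as some conversion tools produce) would show up immediately in the two returned lists.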
If possible, please share the minimal dataset to us to reproduce as well.