Tao unet model outputs only one class

• Hardware : NVIDIA GeForce RTX 2060
• Network Type: UNet
• TLT Version: 3.22.05
• Training spec file : tao_unet_05_08_24_train_v2.txt (1.7 KB)

I am trying to train a UNet model to segment vehicles; there are two classes, ‘background’ and ‘foreground’. I started my first training run with the default UNet config from the TAO Toolkit documentation, with some minor changes (tao_unet_05_08_24_train.txt, 1.7 KB). The model output only one class as the mask, and the loss was stuck after the first epoch. I then referred to Problems encountered in training unet and inference unet - #27 by Morganh and made the modifications from the moderator’s post. After that, the loss propagates better (output.log, 5.0 MB), but at test time the model still predicts only one class. The tao unet evaluate output looked like this: "{'foreground': {'precision': 1.0, 'Recall': 1.0, 'F1 Score': 1.0, 'iou': 1.0}, 'background': {'precision': nan, 'Recall': nan, 'F1 Score': nan, 'iou': nan}}".

Some additional context: the training dataset is coco_2017_train, with no preprocessing applied to the images, using the segmentation annotations for the ‘bus’, ‘truck’ and ‘car’ classes.

Are the mask images correct? For color/RGB input images, each mask image is a single-channel or three-channel image with the same size as the input image. Every pixel in the mask must have an integer value that represents the segmentation class label_id. Refer to Data Annotation Format - NVIDIA Docs. Every pixel needs to be assigned an integer equal to the label_id value provided in the spec.
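One quick way to verify this requirement is to load a mask and compare the pixel values found against the label_ids defined in the spec. A minimal sketch, assuming a two-class spec with background=0 and foreground=1 (the label_ids and file path below are illustrative, not from the original post):

```python
import numpy as np

# label_ids defined in the training spec (assumed: background=0, foreground=1)
VALID_LABEL_IDS = {0, 1}

def unexpected_labels(mask: np.ndarray) -> set:
    """Return pixel values present in the mask that are not valid label_ids."""
    return set(np.unique(mask).tolist()) - VALID_LABEL_IDS

# Usage with a real mask file (path is hypothetical):
#   from PIL import Image
#   mask = np.array(Image.open("masks/000000005037.png"))
#   print(unexpected_labels(mask))  # empty set means the mask matches the spec
```

If the returned set is non-empty, the masks do not match the spec and training will silently learn the wrong labels.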

Masks were generated with the tao unet dataset-convert command. The resulting masks look correct.
[Image: 000000005037-img]
[Mask: 000000005037]

Could you trigger an overfitting experiment to narrow down?
You can modify the spec file in order to use part of the training dataset and also use them as the validation dataset.
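For the overfitting experiment, one way to do this is to point both the training and validation paths of the spec's dataset_config at the same small subset of images and masks. A hedged sketch of the relevant fragment, using the TAO UNet spec's field names (the paths below are placeholders, not from the original spec file):

```
dataset_config {
  dataset: "custom"
  augment: False
  input_image_type: "color"
  train_images_path: "/data/overfit_subset/images"
  train_masks_path: "/data/overfit_subset/masks"
  val_images_path: "/data/overfit_subset/images"
  val_masks_path: "/data/overfit_subset/masks"
}
```

If the model cannot drive the loss to near zero and reproduce the masks on a handful of samples it has memorized, the problem is in the data or spec rather than the hyperparameters.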

I checked your mask file. The pixel values seem incorrect; they do not match the training spec file.
In your mask file, the values are 0 or 6, with a few 3s:

 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 6 6 6 6 6 6 6 6
  6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
  6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
  6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
  6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
  6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
  6 6 6 6 

Can you change the values to 0 and 1 to match your spec file?
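The remapping the moderator asks for can be sketched as follows: collapse every non-zero value in the mask (the 3s and 6s seen in the dump above) to foreground=1 and leave background=0. This is a minimal sketch, assuming a binary background/foreground spec; the file paths in the usage comment are hypothetical:

```python
import numpy as np

def remap_mask(mask: np.ndarray) -> np.ndarray:
    """Map all non-zero pixel values (e.g. 3, 6) to 1; keep 0 as background."""
    return (mask != 0).astype(np.uint8)

# Batch usage sketch (paths are hypothetical):
#   from PIL import Image
#   arr = np.array(Image.open("masks/000000005037.png"))
#   Image.fromarray(remap_mask(arr)).save("masks_fixed/000000005037.png")
```

After rewriting the masks this way, every pixel carries one of the two label_ids declared in the spec.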

BTW, I visualized your mask as below.
[Image: mask]

The masks were the issue, as you pointed out. Now the model outputs more than one class. The output is still not usable, but I should be able to fix that by training with better hyperparameters.

OK, thanks for the info.