Multiple classes not detected?

Please provide the following information when requesting support.

• Training spec file

unet_train_resnet_unet_isbi.txt (2.7 KB)

• How to reproduce the issue?

1. Why are precision, recall, etc. reported as NaN? Didn't I declare them in the spec file?

2. And is this the reason I didn't get a good result?

I was expecting the train track, sky, vegetation, etc. to be marked. Are these two questions related?

Still checking since there are similar topics.
Could you share the public dataset?

Thank you for the reply.
I'm using the RailSem19 dataset plus some Cityscapes data.

Did you ever check whether the public datasets' masks meet the requirement mentioned in Data Annotation Format - NVIDIA Docs?

UNet expects the images and corresponding masks encoded as images. Each mask image is a single-channel image, where every pixel is assigned an integer value that represents the segmentation class.

So I converted all the mask images to single-channel by running cv2's BGR2GRAY. Since the Cityscapes and RailSem19 datasets use different annotations, I first converted their original single-channel mask images to RGB so I could recognize the classes more easily, recolored them to a shared palette to normalize the two datasets (for example, making "tree" green in both), and finally converted the modified masks back to grayscale.
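
Roughly, the steps looked like this (a sketch from memory; the file names and the example class/color are placeholders):

import cv2

# 1. Lift the original single-channel mask to 3 channels so the
#    classes can be recolored for easier visual inspection.
mask = cv2.imread("railsem19_mask.png", cv2.IMREAD_GRAYSCALE)
mask_rgb = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)

# 2. Recolor each dataset's classes to the shared palette,
#    e.g. paint "tree" green in both Cityscapes and RailSem19.
TREE_ID = 3  # placeholder class id
mask_rgb[mask == TREE_ID] = (0, 255, 0)

# 3. Convert back to a single channel for training.
mask_gray = cv2.cvtColor(mask_rgb, cv2.COLOR_BGR2GRAY)
cv2.imwrite("normalized_mask.png", mask_gray)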

P.S. I changed accounts due to the reply restriction on new accounts.

aachen_000000_000019_gtFine_color

So this is one of my mask images, if that helps…

Which version of tlt(tao) did you run?
TLT 3.0-dp-py3 ?
TLT 3.0-py3 ?
Latest 3.21.08 ?

You can check via “tlt info --verbose” or “tao info --verbose”.

print("For multi-GPU, change --gpus based on your machine.")
!tao unet train --gpus=1 --gpu_index=$GPU_INDEX \
  -e /workspace/tlt-experiments/models/unet/specs/unet_train_resnet_unet_isbi.txt \
  -r $USER_EXPERIMENT_DIR/isbi_experiment_unpruned \
  -m $USER_EXPERIMENT_DIR/pretrained_resnet18/pretrained_semantic_segmentation_vresnet18/resnet_18.hdf5 \
  -n model_isbi \
  -k $KEY

This is my training command.
P.S. Could you remove my reply restriction, please? I'm running out of Gmail accounts.

Can you share your training command? Do you run training in a terminal or in a notebook?

I inspected your attached mask label PNG file and found that it has the pixel values below.

[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
36 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
145 146 147 148 149 150 151 152 153 155 156 157 158 159 163 164 166 167
168 169 170 171 172 173 174 176 177 178 179 180 181 182 183 184 185 186
187 188 189 190 191 192 193 195 196 197 198 199 200 201 202 203 204 205
206 207 208 209 210 220 238]

These values do not match your training classes.
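You can reproduce this check with, for example (a minimal sketch; the path is a placeholder):

import cv2
import numpy as np

# Read the mask without any conversion and list every distinct pixel value.
# For a valid UNet mask, these should all be label_id values from the spec.
mask = cv2.imread("your_mask.png", cv2.IMREAD_UNCHANGED)
print(np.unique(mask))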
Please note that the pixel integer value should be equal to the value of the label_id provided in the spec.
UNet expects the images and corresponding masks encoded as images. Each mask image is a single-channel image, where every pixel is assigned an integer value that represents the segmentation class.
See Data Annotation Format - NVIDIA Docs

Also, see UNET - NVIDIA Docs:

Dice loss for multi-class segmentation is not supported.

So for multiple classes, please use “cross_entropy”.
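
For reference, in the spec this looks along these lines (the class names here are just illustrative):

training_config {
  loss: "cross_entropy"
}
dataset_config {
  data_class_config {
    target_classes {
      name: "sky"
      mapping_class: "sky"
      label_id: 0
    }
    target_classes {
      name: "vegetation"
      mapping_class: "vegetation"
      label_id: 1
    }
    # one target_classes entry per class; every mask pixel must hold
    # exactly one of the declared label_id values
  }
}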

Also, if you want the highest accuracy, I suggest using the more powerful backbone: vanilla_unet_dynamic.


Thank you so much, truly appreciated.

I've done what you asked and changed my mask images according to the label IDs (0~17), but even without changing my backbone to vanilla_unet_dynamic, is it normal that I get a precision of around 0.25?


I trained for 25 epochs, but I still don't think the precision should be this low.

This is my changed mask image:
aachen_000000_000019_gtFine_color

This is my current spec file:
unet_train_resnet_unet_isbi.txt (2.7 KB)

P.S. When I googled vanilla_unet_dynamic, the only search result was https://ngc.nvidia.com/catalog/models/nvidia:tlt_peoplesemsegnet. Is this the model you referred to? If so, how can I train it in my current container? Thanks.

Your mask labels still have the values below, which are not expected.

[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73
74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91
92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109
110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145
146 147 148 149 150 151 152 153 155 156 157 158 159 163 164 166 167 168
169 170 171 172 173 174 176 177 178 179 180 181 182 183 184 185 186 187
188 189 190 191 192 193 195 196 197 198 199 200 201 202 203 204 205 206
207 208 209 210]

For vanilla_unet_dynamic, please refer to UNET - NVIDIA Docs.
The arch parameter supports: resnet, vgg, vanilla_unet, efficientnet_b0, vanilla_unet_dynamic.
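
Switching should only require changing the backbone in the model_config section of your spec, e.g.:

model_config {
  arch: "vanilla_unet_dynamic"
  # other model_config fields as in your current spec
}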

I do realize that my mask images have a certain number of pixels outlining the objects, and I haven't come up with a way to solve that yet. But is that the only reason (or the main reason) for the low precision?

UNet expects the images and corresponding masks encoded as images. Each mask image is a single-channel image, where every pixel is assigned an integer value that represents the segmentation class.

Also, I was confused about the numbers between 0~17. Didn't I assign classes from label 0 to label 17? Why did you say they're not expected?


Apart from that, I'll try to change the values above 17.
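
My rough plan for those outline pixels is something like this (a sketch; it simply reassigns anything above 17 to a fallback class):

import cv2
import numpy as np

FALLBACK_ID = 0  # placeholder: the class that absorbs stray outline pixels

mask = cv2.imread("my_mask.png", cv2.IMREAD_UNCHANGED)
stray = mask > 17            # anything outside the declared label_ids 0~17
print("stray pixels:", stray.sum())
mask[stray] = FALLBACK_ID
cv2.imwrite("my_mask_clean.png", mask)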

I have found my problem! Thanks for your help, much appreciated.

Would you please share what the problem was?

Thanks

My Cityscapes dataset seems to have had color-conversion problems, so I recolored it, tried again, and it worked.
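
In case it helps others: cv2's BGR2GRAY computes a weighted luminance value, not a class ID, so the recoloring has to map each annotation color directly to its label_id instead of going through a grayscale conversion. A minimal sketch, with a placeholder palette:

import cv2
import numpy as np

# Placeholder palette: annotation color (BGR, as OpenCV loads images) -> label_id.
PALETTE = {
    (0, 255, 0): 8,   # e.g. vegetation
    (255, 0, 0): 2,   # e.g. sky
}

rgb = cv2.imread("aachen_000000_000019_gtFine_color.png")  # H x W x 3, BGR
labels = np.zeros(rgb.shape[:2], dtype=np.uint8)
for color, label_id in PALETTE.items():
    labels[np.all(rgb == np.array(color), axis=-1)] = label_id
cv2.imwrite("mask_label_ids.png", labels)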
