TAO unet producing nan values


RTX 3090, Ubuntu 18.04
TAO:
dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm']
format_version: 2.0
toolkit_version: 3.22.02
published_date: 02/28/2022

spec file unet_train_resnet_unet_6S.txt (1.4 KB)

Images and masks inside data.zip can be found here: github.com/DaveBGld/TAO.git

To reproduce the problem, in the cv_samples_v1.3.0/unet/unet_isbi.ipynb example, replace the spec file content and the data folders with the above.

After running tao unet evaluate, results_tlt.json contains:

"{'Background': {'precision': 0.97684205, 'Recall': 1.0, 'F1 Score': 0.9882853795702129, 'iou': 0.97684205}, 'Plant2': {'precision': nan, 'Recall': 0.0, 'F1 Score': nan, 'iou': 0.0}, 'Leaf': {'precision': nan, 'Recall': 0.0, 'F1 Score': nan, 'iou': 0.0}}"
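For reference, one reading consistent with these numbers is that every pixel is being predicted as Background: a class that is never predicted has TP + FP = 0, so its precision is 0/0 (nan), while its recall and IoU are 0. The sketch below assumes the usual per-class definitions of these metrics (TAO's own evaluation code may compute them differently), and the pixel counts are hypothetical, chosen only for illustration.

```python
import math


def class_metrics(tp: int, fp: int, fn: int) -> dict:
    """Per-class precision, recall, F1 and IoU from pixel counts.

    A zero denominator yields nan, which is how a 0/0 division
    typically shows up in a metrics report.
    """
    def ratio(num: float, den: float) -> float:
        return num / den if den else math.nan

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    iou = ratio(tp, tp + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Hypothetical counts: a class with 1000 ground-truth pixels that is never predicted.
never_predicted = class_metrics(tp=0, fp=0, fn=1000)

# Hypothetical counts: Background predicted everywhere, so it has no false negatives.
background = class_metrics(tp=97684, fp=2316, fn=0)

print(never_predicted)
print(background)
```

With no false negatives, Background's IoU equals its precision, which matches the pattern in the report above.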

These are color images in PNG files, 512x512, with masks also in 512x512 PNG files, where ALL pixel values in the masks fall within {0, 1, 2} for {Background, Plant2, Leaf}.

Following Morganh's advice in the post "Fail with Transfer Learning with Unet Multiclass, Color Images, Semantic Segmentation" (#18 by david9xqqb), I have the following in the spec:

loss: "cross_entropy"
weight: 2e-06
crop_and_resize_prob: 0.01

Please see my first reply for additional info (the forum did not allow me more than 3 links!).

I am clueless on how to proceed.

Many thanks!

Running the unet isbi example notebook with this data and spec produces this output in the training step: training output 2022 04 04.txt (76.1 KB), and this output in the evaluate step: evaluate output 2022 04 04.txt (41.6 KB).

I also tried following laurim's advice in "Training multi-class UNet does not converge" (#31 by laurim): I set weight to 0 and removed the augmentation_config section, with the same results.

I also tried recreating the masks so that the background label is 2 and plant and leaf are 0 and 1, with the same results.

Please check if your mask images are single-channel images, where every pixel is assigned an integer value that represents the segmentation class.
See Data Annotation Format — TAO Toolkit 3.0 documentation

And also Multiple classes not detected? - #11 by Morganh
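One way to check the single-channel point without extra dependencies is to read the colour type from the PNG header. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and the colour-type codes are the ones defined for the IHDR chunk in the PNG specification. It only inspects the header, so it does not verify that the pixel values themselves are valid class indices.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Colour-type codes defined for the IHDR chunk in the PNG specification.
COLOR_TYPES = {
    0: "grayscale",
    2: "RGB",
    3: "palette",
    4: "grayscale+alpha",
    6: "RGBA",
}


def read_png_header(data: bytes) -> tuple:
    """Return (width, height, bit_depth, color_type) from raw PNG bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # The first chunk must be IHDR with a 13-byte payload.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing or malformed IHDR chunk")
    return struct.unpack(">IIBB", data[16:26])


def is_single_channel(data: bytes) -> bool:
    """True when the PNG is plain grayscale (colour type 0)."""
    return read_png_header(data)[3] == 0


# Usage on a mask file (the path is a placeholder):
#   with open("masks/example.png", "rb") as f:
#       head = f.read(26)
#   print(COLOR_TYPES[read_png_header(head)[3]], is_single_channel(head))
```

A mask saved as RGB or RGBA would report colour type 2 or 6 here even if all three channels hold the same class index, which is the situation the advice above is asking to rule out.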


Thanks!!! That did it

I appreciate your attention to detail and speed in responding! I commend you and thank you (I have a post on the deepstream forum that has not been replied to in over two months and you provided a solution in less than an hour!!!).

You’re welcome.

