I tried the BDD100K dataset with the new TAO image and it works, as long as the images are converted to PNG. No resizing was needed, and the labels worked out of the box. Here are my final conclusions regarding UNet:
- Automatic mixed precision must be disabled. Otherwise, the loss function will become NaN and the training will terminate prematurely.
- All input images must be PNG. Otherwise, the loss function value will not decrease and the inference results will be very poor.
- It is recommended to use the new TAO docker image. Otherwise, some non-trivial manual resizing needs to be done for the images and labels.
- The training is extremely sensitive to the spec file values. For example, changing the regularization weight to 2e-06 and setting crop_and_resize_prob to 0.01 will cause the training to fail and terminate prematurely. An example spec file that works can be found in my previous message.
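
Since the PNG requirement is easy to miss (BDD100K ships as JPEG), a quick check before training can help. Below is a minimal sketch, not part of TAO itself: the function name `find_non_png_files` is my own, and it only looks at the file extension and the standard 8-byte PNG signature, so it says nothing about whether TAO will accept the image otherwise.

```python
from pathlib import Path

# Every valid PNG file starts with these 8 bytes (PNG specification).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def find_non_png_files(image_dir):
    """Return sorted paths of files under image_dir that are not real PNGs.

    Hypothetical helper: a file is flagged if its extension is not .png
    or if its first 8 bytes are not the PNG signature (for example, a
    JPEG that was only renamed to .png).
    """
    offenders = []
    for path in sorted(Path(image_dir).rglob("*")):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            header = handle.read(len(PNG_SIGNATURE))
        if path.suffix.lower() != ".png" or header != PNG_SIGNATURE:
            offenders.append(path)
    return offenders
```

Calling `find_non_png_files("/path/to/images")` (the path is a placeholder) on the image and mask folders should return an empty list before you start training. Anything it returns needs to be converted, not just renamed.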