There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
Hi,
When you run training inside the 5.0 docker, please change lines 199-207 of /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/unet/scripts/train.py as below.
It will fix the issue.
# Initialize env for AMP training
if params.use_amp:
    os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '1'
    # Enable automatic loss scaling
    os.environ["TF_ENABLE_AUTO_MIXED_PRECISION_LOSS_SCALING"] = '1'
else:
    os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '0'
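If you want to sanity-check the logic outside of train.py, here is a minimal standalone sketch of the same branch. The helper name `configure_amp` is hypothetical and is not part of TAO; it only mirrors the snippet above, with a plain boolean standing in for `params.use_amp`.

```python
import os


def configure_amp(use_amp):
    """Hypothetical helper mirroring the patched branch in train.py."""
    if use_amp:
        # Turn on automatic mixed precision
        os.environ["TF_ENABLE_AUTO_MIXED_PRECISION"] = "1"
        # Enable automatic loss scaling
        os.environ["TF_ENABLE_AUTO_MIXED_PRECISION_LOSS_SCALING"] = "1"
    else:
        # Explicitly turn automatic mixed precision off
        os.environ["TF_ENABLE_AUTO_MIXED_PRECISION"] = "0"


if __name__ == "__main__":
    configure_amp(True)
    # Show what the training process would see for the AMP switch
    print(os.environ["TF_ENABLE_AUTO_MIXED_PRECISION"])
```

Note that, as in the snippet above, the else branch only resets TF_ENABLE_AUTO_MIXED_PRECISION and leaves the loss scaling variable untouched.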
Thanks.