TAO UNet training from unet.ipynb in Jupyter Notebook fails

I’m running the open UNet model architecture using the Jupyter notebook unet/unet.ipynb.
When I run the “tao unet train” cell, it fails; the details are shown below.
Why might this be happening?
How can I fix it?

For multi-GPU, change --gpus based on your machine.
2022-07-14 14:31:01,359 [INFO] root: Registry: ['nvcr.io']
2022-07-14 14:31:01,486 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/ubuntu/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
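(For reference, the warning above is about the "DockerOptions" section of ~/.tao_mounts.json. A minimal sketch of that section, with placeholder values — substitute the output of `id -u` and `id -g` on your own host — would look like:)

```json
{
    "DockerOptions": {
        "user": "1000:1000"
    }
}
```

This is optional and unrelated to the failure itself; it only makes files written by the container owned by your host user instead of root.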
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/hooks/checkpoint_saver_hook.py:21: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/hooks/pretrained_restore_hook.py:23: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/hooks/pretrained_restore_hook.py:23: The name tf.logging.WARN is deprecated. Please use tf.compat.v1.logging.WARN instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/scripts/train.py:405: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

Loading experiment spec at /home/ubuntu/Desktop/new_model/tao-experiment/unet/specs/unet_train_resnet_unet_isbi.txt.
2022-07-14 06:31:07,981 [INFO] __main__: Loading experiment spec at /home/ubuntu/Desktop/new_model/tao-experiment/unet/specs/unet_train_resnet_unet_isbi.txt.
2022-07-14 06:31:07,983 [INFO] iva.unet.spec_handler.spec_loader: Merging specification from /home/ubuntu/Desktop/new_model/tao-experiment/unet/specs/unet_train_resnet_unet_isbi.txt
2022-07-14 06:31:07,985 [INFO] root: Initializing the pre-trained weights from /home/ubuntu/Desktop/new_model/tao-experiment/unet/pretrained_resnet18/pretrained_semantic_segmentation_vresnet18/resnet_18.hdf5
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

2022-07-14 06:31:07,988 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

2022-07-14 06:31:07,999 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

2022-07-14 06:31:08,020 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

2022-07-14 06:31:08,025 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

WARNING:tensorflow:From /opt/nvidia/third_party/keras/tensorflow_backend.py:187: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

2022-07-14 06:31:08,864 [WARNING] tensorflow: From /opt/nvidia/third_party/keras/tensorflow_backend.py:187: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

2022-07-14 06:31:09,066 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

2022-07-14 06:31:09,066 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

2022-07-14 06:31:09,234 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.

2022-07-14 06:31:09,716 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.

2022-07-14 06:31:09,735 [INFO] iva.unet.model.utilities: Label Id 0: Train Id 0
2022-07-14 06:31:09,735 [INFO] iva.unet.model.utilities: Label Id 1: Train Id 1
INFO:tensorflow:Using config: {'_model_dir': '/home/ubuntu/Desktop/new_model/tao-experiment/unet/isbi_experiment_unpruned', '_tf_random_seed': None, '_save_summary_steps': 5, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': intra_op_parallelism_threads: 1
inter_op_parallelism_threads: 38
gpu_options {
  allow_growth: true
  visible_device_list: "0"
  force_gpu_compatible: true
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': None, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f52fe8ca908>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
2022-07-14 06:31:09,736 [INFO] tensorflow: Using config: {'_model_dir': '/home/ubuntu/Desktop/new_model/tao-experiment/unet/isbi_experiment_unpruned', '_tf_random_seed': None, '_save_summary_steps': 5, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': intra_op_parallelism_threads: 1
inter_op_parallelism_threads: 38
gpu_options {
  allow_growth: true
  visible_device_list: "0"
  force_gpu_compatible: true
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': None, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f52fe8ca908>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Phase train: Total 20 files.
2022-07-14 06:31:09,747 [INFO] iva.unet.model.utilities: The total number of training samples 20 and the batch size per                 GPU 3
2022-07-14 06:31:09,747 [INFO] iva.unet.model.utilities: Cannot iterate over exactly 20 samples with a batch size of 3; each epoch will therefore take one extra step.
2022-07-14 06:31:09,747 [INFO] iva.unet.model.utilities: Steps per epoch taken: 7
Running for 50 Epochs
2022-07-14 06:31:09,747 [INFO] __main__: Running for 50 Epochs
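(The "Steps per epoch taken: 7" line above is just the sample count divided by the batch size, rounded up, as the preceding log line explains — 20 samples with a batch size of 3 cannot be iterated exactly, so one extra step is taken. A quick sanity check:)

```python
import math

num_samples = 20   # "Phase train: Total 20 files."
batch_size = 3     # batch size per GPU from the spec file
epochs = 50

# ceil(20 / 3) = 7, matching "Steps per epoch taken: 7" in the log
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * epochs  # 350 steps for the full run

print(steps_per_epoch, total_steps)
```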
INFO:tensorflow:Create CheckpointSaverHook.
2022-07-14 06:31:09,747 [INFO] tensorflow: Create CheckpointSaverHook.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

2022-07-14 06:31:10,563 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

WARNING:tensorflow:Entity <bound method Dataset.read_image_and_label_tensors of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.read_image_and_label_tensors of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-07-14 06:31:10,616 [WARNING] tensorflow: Entity <bound method Dataset.read_image_and_label_tensors of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.read_image_and_label_tensors of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f528ff46e18> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f528ff46e18>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-07-14 06:31:10,636 [WARNING] tensorflow: Entity <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f528ff46e18> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f528ff46e18>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method Dataset.rgb_to_bgr_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.rgb_to_bgr_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-07-14 06:31:10,644 [WARNING] tensorflow: Entity <bound method Dataset.rgb_to_bgr_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.rgb_to_bgr_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method Dataset.cast_img_lbl_dtype_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.cast_img_lbl_dtype_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-07-14 06:31:10,654 [WARNING] tensorflow: Entity <bound method Dataset.cast_img_lbl_dtype_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.cast_img_lbl_dtype_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method Dataset.resize_image_and_label_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.resize_image_and_label_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-07-14 06:31:10,663 [WARNING] tensorflow: Entity <bound method Dataset.resize_image_and_label_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.resize_image_and_label_tf of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/utils/data_loader.py:414: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.

2022-07-14 06:31:10,663 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/utils/data_loader.py:414: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.

WARNING:tensorflow:Entity <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397d2f0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397d2f0>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-07-14 06:31:10,676 [WARNING] tensorflow: Entity <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397d2f0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397d2f0>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397dbf8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397dbf8>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-07-14 06:31:10,684 [WARNING] tensorflow: Entity <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397dbf8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397dbf8>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <bound method Dataset.transpose_to_nchw of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.transpose_to_nchw of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-07-14 06:31:10,692 [WARNING] tensorflow: Entity <bound method Dataset.transpose_to_nchw of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Dataset.transpose_to_nchw of <iva.unet.utils.data_loader.Dataset object at 0x7f52fe8ca860>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397dd08> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397dd08>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-07-14 06:31:10,707 [WARNING] tensorflow: Entity <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397dd08> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <function Dataset.input_fn_aigs_tf.<locals>.<lambda> at 0x7f515397dd08>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
INFO:tensorflow:Calling model_fn.
2022-07-14 06:31:10,730 [INFO] tensorflow: Calling model_fn.
2022-07-14 06:31:10,730 [INFO] iva.unet.utils.model_fn: {'exec_mode': 'train', 'model_dir': '/home/ubuntu/Desktop/new_model/tao-experiment/unet/isbi_experiment_unpruned', 'resize_padding': False, 'resize_method': 'BILINEAR', 'log_dir': None, 'batch_size': 3, 'learning_rate': 9.999999747378752e-05, 'crossvalidation_idx': None, 'max_steps': None, 'regularizer_type': 2, 'weight_decay': 1.9999999494757503e-05, 'log_summary_steps': 10, 'warmup_steps': 0, 'augment': False, 'use_amp': False, 'use_trt': False, 'use_xla': False, 'loss': 'cross_dice_sum', 'epochs': 50, 'pretrained_weights_file': None, 'unet_model': <iva.unet.model.resnet_unet.ResnetUnet object at 0x7f528ff77e80>, 'key': 'nvidia_tlt', 'experiment_spec': random_seed: 42
dataset_config {
  dataset: "custom"
  input_image_type: "grayscale"
  train_images_path: "/home/ubuntu/Desktop/new_model/tao-experiment/data/isbi/images/train"
  train_masks_path: "/home/ubuntu/Desktop/new_model/tao-experiment/data/isbi/masks/train"
  val_images_path: "/home/ubuntu/Desktop/new_model/tao-experiment/data/isbi/images/val"
  val_masks_path: "/home/ubuntu/Desktop/new_model/tao-experiment/data/isbi/masks/val"
  test_images_path: "/home/ubuntu/Desktop/new_model/tao-experiment/data/isbi/images/test"
  data_class_config {
    target_classes {
      name: "foreground"
      mapping_class: "foreground"
    }
    target_classes {
      name: "background"
      label_id: 1
      mapping_class: "background"
    }
  }
  augmentation_config {
    spatial_augmentation {
      hflip_probability: 0.5
      vflip_probability: 0.5
      crop_and_resize_prob: 0.5
    }
    brightness_augmentation {
      delta: 0.20000000298023224
    }
  }
}
model_config {
  num_layers: 18
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
  all_projections: true
  model_input_height: 320
  model_input_width: 320
  model_input_channels: 1
}
training_config {
  batch_size: 3
  regularizer {
    type: L2
    weight: 1.9999999494757503e-05
  }
  optimizer {
    adam {
      epsilon: 9.99999993922529e-09
      beta1: 0.8999999761581421
      beta2: 0.9990000128746033
    }
  }
  checkpoint_interval: 1
  log_summary_steps: 10
  learning_rate: 9.999999747378752e-05
  loss: "cross_dice_sum"
  epochs: 50
}
, 'seed': 42, 'benchmark': False, 'temp_dir': '/tmp/tmpy3xdbl94', 'num_classes': 2, 'start_step': 0, 'checkpoint_interval': 1, 'model_json': None, 'load_graph': False, 'phase': None}
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 1, 320, 320)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 160, 160) 3200        input_1[0][0]                    
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 160, 160) 0           conv1[0][0]                      
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 80, 80)   36928       activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 64, 80, 80)   0           block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 80, 80)   36928       block_1a_relu_1[0][0]            
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 80, 80)   4160        activation_1[0][0]               
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 80, 80)   0           block_1a_conv_2[0][0]            
                                                                 block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 64, 80, 80)   0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 80, 80)   36928       block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 64, 80, 80)   0           block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 80, 80)   36928       block_1b_relu_1[0][0]            
__________________________________________________________________________________________________
block_1b_conv_shortcut (Conv2D) (None, 64, 80, 80)   4160        block_1a_relu[0][0]              
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 80, 80)   0           block_1b_conv_2[0][0]            
                                                                 block_1b_conv_shortcut[0][0]     
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 64, 80, 80)   0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 40, 40)  73856       block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 128, 40, 40)  0           block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 40, 40)  147584      block_2a_relu_1[0][0]            
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 40, 40)  8320        block_1b_relu[0][0]              
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 40, 40)  0           block_2a_conv_2[0][0]            
                                                                 block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 128, 40, 40)  0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 40, 40)  147584      block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 128, 40, 40)  0           block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 40, 40)  147584      block_2b_relu_1[0][0]            
__________________________________________________________________________________________________
block_2b_conv_shortcut (Conv2D) (None, 128, 40, 40)  16512       block_2a_relu[0][0]              
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 40, 40)  0           block_2b_conv_2[0][0]            
                                                                 block_2b_conv_shortcut[0][0]     
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 128, 40, 40)  0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 20, 20)  295168      block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 256, 20, 20)  0           block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 20, 20)  590080      block_3a_relu_1[0][0]            
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 20, 20)  33024       block_2b_relu[0][0]              
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 20, 20)  0           block_3a_conv_2[0][0]            
                                                                 block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 256, 20, 20)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 20, 20)  590080      block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 256, 20, 20)  0           block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 20, 20)  590080      block_3b_relu_1[0][0]            
__________________________________________________________________________________________________
block_3b_conv_shortcut (Conv2D) (None, 256, 20, 20)  65792       block_3a_relu[0][0]              
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 20, 20)  0           block_3b_conv_2[0][0]            
                                                                 block_3b_conv_shortcut[0][0]     
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 256, 20, 20)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 20, 20)  1180160     block_3b_relu[0][0]              
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 512, 20, 20)  0           block_4a_conv_1[0][0]            
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 20, 20)  2359808     block_4a_relu_1[0][0]            
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 20, 20)  131584      block_3b_relu[0][0]              
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 20, 20)  0           block_4a_conv_2[0][0]            
                                                                 block_4a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 512, 20, 20)  0           add_7[0][0]                      
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 20, 20)  2359808     block_4a_relu[0][0]              
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 512, 20, 20)  0           block_4b_conv_1[0][0]            
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 20, 20)  2359808     block_4b_relu_1[0][0]            
__________________________________________________________________________________________________
block_4b_conv_shortcut (Conv2D) (None, 512, 20, 20)  262656      block_4a_relu[0][0]              
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 20, 20)  0           block_4b_conv_2[0][0]            
                                                                 block_4b_conv_shortcut[0][0]     
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 512, 20, 20)  0           add_8[0][0]                      
__________________________________________________________________________________________________
conv2d_transpose_1 (Conv2DTrans (None, 256, 40, 40)  2097408     block_4b_relu[0][0]              
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 384, 40, 40)  0           conv2d_transpose_1[0][0]         
                                                                 block_2b_relu[0][0]              
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 384, 40, 40)  0           concatenate_1[0][0]              
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 256, 40, 40)  884992      activation_2[0][0]               
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 256, 40, 40)  0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
conv2d_transpose_2 (Conv2DTrans (None, 128, 80, 80)  524416      activation_3[0][0]               
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 192, 80, 80)  0           conv2d_transpose_2[0][0]         
                                                                 block_1b_relu[0][0]              
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 192, 80, 80)  0           concatenate_2[0][0]              
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 128, 80, 80)  221312      activation_4[0][0]               
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 128, 80, 80)  0           conv2d_2[0][0]                   
__________________________________________________________________________________________________
conv2d_transpose_3 (Conv2DTrans (None, 64, 160, 160) 131136      activation_5[0][0]               
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 128, 160, 160 0           conv2d_transpose_3[0][0]         
                                                                 activation_1[0][0]               
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 128, 160, 160 0           concatenate_3[0][0]              
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 64, 160, 160) 73792       activation_6[0][0]               
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 64, 160, 160) 0           conv2d_3[0][0]                   
__________________________________________________________________________________________________
conv2d_transpose_4 (Conv2DTrans (None, 64, 320, 320) 65600       activation_7[0][0]               
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 64, 320, 320) 0           conv2d_transpose_4[0][0]         
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 64, 320, 320) 36928       activation_8[0][0]               
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 64, 320, 320) 0           conv2d_4[0][0]                   
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 2, 320, 320)  1154        activation_9[0][0]               
==================================================================================================
Total params: 15,555,458
Trainable params: 15,555,458
Non-trainable params: 0
__________________________________________________________________________________________________
2022-07-14 14:31:11,315 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Hi,

This looks like a TAO Toolkit related issue. We will move this post to the TAO Toolkit forum.

Thanks!


Could you please run the commands below to narrow down the issue?
$ tao unet run /bin/bash
Then, inside the docker container, run:
# unet train xxx
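For reference, the two-step flow above can be sketched as follows. The spec file, results directory, and key below are placeholders, not values from your notebook; replace them with your own. Running `unet train` directly inside the container prints the full Python traceback instead of the container silently stopping.

```shell
# Step 1 (on the host): open an interactive shell inside the TAO container.
#   tao unet run /bin/bash
#
# Step 2 (inside the container): invoke training directly, without the
# "tao" launcher. The -e/-r/-k flags mirror the notebook's train cell;
# all three values below are placeholders.
SPEC=/workspace/tao-experiments/unet/specs/unet_train.txt   # placeholder spec
RESULTS=/workspace/tao-experiments/unet/results             # placeholder output dir
KEY=tlt_encode                                              # placeholder key
echo unet train -e "$SPEC" -r "$RESULTS" -k "$KEY"
```

The command printed by the `echo` is what you would actually run inside the container; any error it produces on the terminal is the detail needed to diagnose the failure.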

There has been no update from you for a while, so we assume this is no longer an issue.
Hence, we are closing this topic. If you need further support, please open a new one.
Thanks