For multi-GPU, change --gpus based on your machine.

2022-04-05 01:16:03,349 [INFO] root: Registry: ['nvcr.io']
2022-04-05 01:16:03,492 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-pziuxtzd because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/hooks/checkpoint_saver_hook.py:21: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/hooks/pretrained_restore_hook.py:23: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/hooks/pretrained_restore_hook.py:23: The name tf.logging.WARN is deprecated. Please use tf.compat.v1.logging.WARN instead.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/scripts/train.py:410: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
Loading experiment spec at /workspace/tao-experiments/specs/unet_train_resnet_unet_6S.txt.
2022-04-04 22:16:10,018 [INFO] __main__: Loading experiment spec at /workspace/tao-experiments/specs/unet_train_resnet_unet_6S.txt.
2022-04-04 22:16:10,020 [INFO] iva.unet.spec_handler.spec_loader: Merging specification from /workspace/tao-experiments/specs/unet_train_resnet_unet_6S.txt
2022-04-04 22:16:10,022 [INFO] root: Initializing the pre-trained weights from /workspace/tao-experiments//pretrained_resnet18/pretrained_semantic_segmentation_vresnet18/resnet_18.hdf5
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
2022-04-04 22:16:10,024 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
2022-04-04 22:16:10,034 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:245: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2022-04-04 22:16:10,044 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:245: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
2022-04-04 22:16:10,052 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.
2022-04-04 22:16:10,056 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.
WARNING:tensorflow:From /opt/nvidia/third_party/keras/tensorflow_backend.py:187: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.
2022-04-04 22:16:10,709 [WARNING] tensorflow: From /opt/nvidia/third_party/keras/tensorflow_backend.py:187: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
2022-04-04 22:16:10,879 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
2022-04-04 22:16:10,879 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
2022-04-04 22:16:10,879 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
2022-04-04 22:16:11,107 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.
2022-04-04 22:16:11,594 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.
2022-04-04 22:16:11,613 [INFO] iva.unet.model.utilities: Label Id 0: Train Id 0
2022-04-04 22:16:11,613 [INFO] iva.unet.model.utilities: Label Id 1: Train Id 1
2022-04-04 22:16:11,613 [INFO] iva.unet.model.utilities: Label Id 2: Train Id 2
INFO:tensorflow:Using config: {'_model_dir': '/workspace/tao-experiments//unpruned', '_tf_random_seed': None, '_save_summary_steps': 5, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': intra_op_parallelism_threads: 1
inter_op_parallelism_threads: 38
gpu_options {
  allow_growth: true
  visible_device_list: "0"
  force_gpu_compatible: true
}
allow_soft_placement: true
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': None, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': , '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
2022-04-04 22:16:11,614 [INFO] tensorflow: Using config: {'_model_dir': '/workspace/tao-experiments//unpruned', '_tf_random_seed': None, '_save_summary_steps': 5, '_save_checkpoints_steps': None, '_save_checkpoints_secs': None, '_session_config': intra_op_parallelism_threads: 1
inter_op_parallelism_threads: 38
gpu_options {
  allow_growth: true
  visible_device_list: "0"
  force_gpu_compatible: true
}
allow_soft_placement: true
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': None, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': , '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
Phase train: Total 40 files.
2022-04-04 22:16:11,627 [INFO] iva.unet.model.utilities: The total number of training samples 40 and the batch size per GPU 3
2022-04-04 22:16:11,627 [INFO] iva.unet.model.utilities: Cannot iterate over exactly 40 samples with a batch size of 3; each epoch will therefore take one extra step.
2022-04-04 22:16:11,627 [INFO] iva.unet.model.utilities: Steps per epoch taken: 14
Running for 50 Epochs
2022-04-04 22:16:11,627 [INFO] __main__: Running for 50 Epochs
INFO:tensorflow:Create CheckpointSaverHook.
2022-04-04 22:16:11,627 [INFO] tensorflow: Create CheckpointSaverHook.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
2022-04-04 22:16:11,968 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,016 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team.
When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity . at 0x7fba5b081ea0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fba5b081ea0>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,031 [WARNING] tensorflow: Entity . at 0x7fba5b081ea0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fba5b081ea0>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
2022-04-04 22:16:12,033 [WARNING] tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
/opt/nvidia/third_party/keras/tensorflow_backend.py:356: UserWarning: Creating resources inside a function passed to Dataset.map() is not supported. Create each resource outside the function, and capture it inside the function to use it.
  self, _map_func_set_random_wrapper, num_parallel_calls=num_parallel_calls
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,042 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,048 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,054 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/utils/data_loader.py:451: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.
2022-04-04 22:16:12,055 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/utils/data_loader.py:451: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.
WARNING:tensorflow:Entity . at 0x7fb898041048> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fb898041048>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,064 [WARNING] tensorflow: Entity . at 0x7fb898041048> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fb898041048>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity . at 0x7fb8980417b8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fb8980417b8>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,072 [WARNING] tensorflow: Entity . at 0x7fb8980417b8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fb8980417b8>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
2022-04-04 22:16:12,131 [WARNING] tensorflow: The operation `tf.image.convert_image_dtype` will be skipped since the input and output dtypes are identical.
WARNING:tensorflow:Entity . at 0x7fb8980419d8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fb8980419d8>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,142 [WARNING] tensorflow: Entity . at 0x7fb8980419d8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fb8980419d8>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,148 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
WARNING:tensorflow:Entity . at 0x7fb898041d90> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fb898041d90>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2022-04-04 22:16:12,155 [WARNING] tensorflow: Entity . at 0x7fb898041d90> could not be transformed and will be executed as-is. Please report this to the AutoGraph team.
When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of . at 0x7fb898041d90>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
INFO:tensorflow:Calling model_fn.
2022-04-04 22:16:12,171 [INFO] tensorflow: Calling model_fn.
2022-04-04 22:16:12,171 [INFO] iva.unet.utils.model_fn: {'exec_mode': 'train', 'model_dir': '/workspace/tao-experiments//unpruned', 'resize_padding': False, 'resize_method': 'BILINEAR', 'log_dir': None, 'batch_size': 3, 'learning_rate': 9.999999747378752e-05, 'activation': 'softmax', 'crossvalidation_idx': None, 'max_steps': None, 'regularizer_type': 2, 'weight_decay': 1.9999999949504854e-06, 'log_summary_steps': 10, 'warmup_steps': 0, 'augment': True, 'use_amp': False, 'use_trt': False, 'use_xla': False, 'loss': 'cross_entropy', 'epochs': 50, 'pretrained_weights_file': None, 'lr_scheduler': None, 'unet_model': , 'key': 'nvidia_tlt', 'experiment_spec': random_seed: 42
dataset_config {
  augment: true
  dataset: "custom"
  input_image_type: "color"
  train_images_path: "/workspace/tao-experiments/data/images/train"
  train_masks_path: "/workspace/tao-experiments/data/masks/train"
  val_images_path: "/workspace/tao-experiments/data/images/val"
  val_masks_path: "/workspace/tao-experiments/data/masks/val"
  test_images_path: "/workspace/tao-experiments/data/images/test"
  data_class_config {
    target_classes {
      name: "Background"
      mapping_class: "Background"
    }
    target_classes {
      name: "Plant"
      label_id: 1
      mapping_class: "Plant"
    }
    target_classes {
      name: "Leaf"
      label_id: 2
      mapping_class: "Leaf"
    }
  }
  augmentation_config {
    spatial_augmentation {
      hflip_probability: 0.5
      vflip_probability: 0.5
      crop_and_resize_prob: 0.009999999776482582
    }
  }
}
model_config {
  num_layers: 18
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
  all_projections: true
  model_input_height: 512
  model_input_width: 512
  model_input_channels: 3
}
training_config {
  batch_size: 3
  regularizer {
    type: L2
    weight: 1.9999999949504854e-06
  }
  optimizer {
    adam {
      epsilon: 9.99999993922529e-09
      beta1: 0.8999999761581421
      beta2: 0.9990000128746033
    }
  }
  checkpoint_interval: 1
  log_summary_steps: 10
  learning_rate: 9.999999747378752e-05
  loss: "cross_entropy"
  epochs: 50
}
, 'seed': 42, 'benchmark': False, 'temp_dir': '/tmp/tmp24mbzrsu', 'num_classes': 3, 'num_conf_mat_classes': 3, 'start_step': 0, 'checkpoint_interval': 1, 'model_json': None, 'load_graph': False, 'weights_monitor': False, 'phase': None}
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 3, 512, 512)  0
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 256, 256) 9472        input_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 256, 256) 0           conv1[0][0]
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 128, 128) 36928       activation_1[0][0]
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 64, 128, 128) 0           block_1a_conv_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 128, 128) 36928       block_1a_relu_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 128, 128) 4160        activation_1[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 128, 128) 0           block_1a_conv_2[0][0]
                                                                 block_1a_conv_shortcut[0][0]
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 64, 128, 128) 0           add_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 128, 128) 36928       block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 64, 128, 128) 0           block_1b_conv_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 128, 128) 36928       block_1b_relu_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_shortcut (Conv2D) (None, 64, 128, 128) 4160        block_1a_relu[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 128, 128) 0           block_1b_conv_2[0][0]
                                                                 block_1b_conv_shortcut[0][0]
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 64, 128, 128) 0           add_2[0][0]
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 64, 64)  73856       block_1b_relu[0][0]
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 128, 64, 64)  0           block_2a_conv_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 64, 64)  147584      block_2a_relu_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 64, 64)  8320        block_1b_relu[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 64, 64)  0           block_2a_conv_2[0][0]
                                                                 block_2a_conv_shortcut[0][0]
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 128, 64, 64)  0           add_3[0][0]
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 64, 64)  147584      block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 128, 64, 64)  0           block_2b_conv_1[0][0]
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 64, 64)  147584      block_2b_relu_1[0][0]
__________________________________________________________________________________________________
block_2b_conv_shortcut (Conv2D) (None, 128, 64, 64)  16512       block_2a_relu[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 64, 64)  0           block_2b_conv_2[0][0]
                                                                 block_2b_conv_shortcut[0][0]
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 128, 64, 64)  0           add_4[0][0]
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 32, 32)  295168      block_2b_relu[0][0]
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 256, 32, 32)  0           block_3a_conv_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 32, 32)  590080      block_3a_relu_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 32, 32)  33024       block_2b_relu[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 32, 32)  0           block_3a_conv_2[0][0]
                                                                 block_3a_conv_shortcut[0][0]
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 256, 32, 32)  0           add_5[0][0]
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 32, 32)  590080      block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 256, 32, 32)  0           block_3b_conv_1[0][0]
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 32, 32)  590080      block_3b_relu_1[0][0]
__________________________________________________________________________________________________
block_3b_conv_shortcut (Conv2D) (None, 256, 32, 32)  65792       block_3a_relu[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 32, 32)  0           block_3b_conv_2[0][0]
                                                                 block_3b_conv_shortcut[0][0]
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 256, 32, 32)  0           add_6[0][0]
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 32, 32)  1180160     block_3b_relu[0][0]
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 512, 32, 32)  0           block_4a_conv_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 32, 32)  2359808     block_4a_relu_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 32, 32)  131584      block_3b_relu[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 32, 32)  0           block_4a_conv_2[0][0]
                                                                 block_4a_conv_shortcut[0][0]
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 512, 32, 32)  0           add_7[0][0]
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 32, 32)  2359808     block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 512, 32, 32)  0           block_4b_conv_1[0][0]
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 32, 32)  2359808     block_4b_relu_1[0][0]
__________________________________________________________________________________________________
block_4b_conv_shortcut (Conv2D) (None, 512, 32, 32)  262656      block_4a_relu[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 32, 32)  0           block_4b_conv_2[0][0]
                                                                 block_4b_conv_shortcut[0][0]
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 512, 32, 32)  0           add_8[0][0]
__________________________________________________________________________________________________
conv2d_transpose_1 (Conv2DTrans (None, 256, 64, 64)  2097408     block_4b_relu[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 384, 64, 64)  0           conv2d_transpose_1[0][0]
                                                                 block_2b_relu[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 384, 64, 64)  0           concatenate_1[0][0]
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 256, 64, 64)  884992      activation_2[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 256, 64, 64)  0           conv2d_1[0][0]
__________________________________________________________________________________________________
conv2d_transpose_2 (Conv2DTrans (None, 128, 128, 128 524416      activation_3[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 192, 128, 128 0           conv2d_transpose_2[0][0]
                                                                 block_1b_relu[0][0]
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 192, 128, 128 0           concatenate_2[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 128, 128, 128 221312      activation_4[0][0]
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 128, 128, 128 0           conv2d_2[0][0]
__________________________________________________________________________________________________
conv2d_transpose_3 (Conv2DTrans (None, 64, 256, 256) 131136      activation_5[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 128, 256, 256 0           conv2d_transpose_3[0][0]
                                                                 activation_1[0][0]
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 128, 256, 256 0           concatenate_3[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 64, 256, 256) 73792       activation_6[0][0]
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 64, 256, 256) 0           conv2d_3[0][0]
__________________________________________________________________________________________________
conv2d_transpose_4 (Conv2DTrans (None, 64, 512, 512) 65600       activation_7[0][0]
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 64, 512, 512) 0           conv2d_transpose_4[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 64, 512, 512) 36928       activation_8[0][0]
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 64, 512, 512) 0           conv2d_4[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 3, 512, 512)  1731        activation_9[0][0]
==================================================================================================
Total params: 15,562,307
Trainable params: 15,562,307
Non-trainable params: 0
__________________________________________________________________________________________________
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/utils/model_fn.py:225: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.
2022-04-04 22:16:12,713 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/utils/model_fn.py:225: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.
INFO:tensorflow:Done calling model_fn.
2022-04-04 22:16:13,801 [INFO] tensorflow: Done calling model_fn.
INFO:tensorflow:Graph was finalized.
2022-04-04 22:16:14,617 [INFO] tensorflow: Graph was finalized.
INFO:tensorflow:Running local_init_op.
2022-04-04 22:16:15,377 [INFO] tensorflow: Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2022-04-04 22:16:15,435 [INFO] tensorflow: Done running local_init_op.
[GPU] Restoring pretrained weights from: /tmp/tmpx0mi1abz/model.ckpt-1
2022-04-04 22:16:15,846 [INFO] iva.unet.hooks.pretrained_restore_hook: Pretrained weights loaded with success...
INFO:tensorflow:Saving checkpoints for step-0.
2022-04-04 22:16:17,981 [INFO] tensorflow: Saving checkpoints for step-0.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/hooks/training_hook.py:95: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.
2022-04-04 22:16:20,536 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/unet/hooks/training_hook.py:95: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.
Epoch: 0/50:, Cur-Step: 0, loss(cross_entropy): 1.10564, Running average loss:1.10564, Time taken: 0:00:00 ETA: 0:00:00
2022-04-04 22:16:28,602 [INFO] __main__: Epoch: 0/50:, Cur-Step: 0, loss(cross_entropy): 1.10564, Running average loss:1.10564, Time taken: 0:00:00 ETA: 0:00:00
Epoch: 0/50:, Cur-Step: 10, loss(cross_entropy): 1.00712, Running average loss:1.07482, Time taken: 0:00:00 ETA: 0:00:00
2022-04-04 22:16:35,025 [INFO] __main__: Epoch: 0/50:, Cur-Step: 10, loss(cross_entropy): 1.00712, Running average loss:1.07482, Time taken: 0:00:00 ETA: 0:00:00
INFO:tensorflow:Saving checkpoints for step-14.
2022-04-04 22:16:35,328 [INFO] tensorflow: Saving checkpoints for step-14.
Epoch: 1/50:, Cur-Step: 20, loss(cross_entropy): 0.37199, Running average loss:0.44709, Time taken: 0:00:23.699786 ETA: 0:19:21.289535
2022-04-04 22:16:38,155 [INFO] __main__: Epoch: 1/50:, Cur-Step: 20, loss(cross_entropy): 0.37199, Running average loss:0.44709, Time taken: 0:00:23.699786 ETA: 0:19:21.289535
INFO:tensorflow:Saving checkpoints for step-28.
2022-04-04 22:16:38,726 [INFO] tensorflow: Saving checkpoints for step-28.
Epoch: 2/50:, Cur-Step: 30, loss(cross_entropy): 0.22095, Running average loss:0.28419, Time taken: 0:00:03.498645 ETA: 0:02:47.934963
2022-04-04 22:16:41,257 [INFO] __main__: Epoch: 2/50:, Cur-Step: 30, loss(cross_entropy): 0.22095, Running average loss:0.28419, Time taken: 0:00:03.498645 ETA: 0:02:47.934963
Epoch: 2/50:, Cur-Step: 40, loss(cross_entropy): 0.16499, Running average loss:0.25439, Time taken: 0:00:03.498645 ETA: 0:02:47.934963
2022-04-04 22:16:42,051 [INFO] __main__: Epoch: 2/50:, Cur-Step: 40, loss(cross_entropy): 0.16499, Running average loss:0.25439, Time taken: 0:00:03.498645 ETA: 0:02:47.934963
INFO:tensorflow:Saving checkpoints for step-42.
2022-04-04 22:16:42,122 [INFO] tensorflow: Saving checkpoints for step-42.
Epoch: 3/50:, Cur-Step: 50, loss(cross_entropy): 0.27909, Running average loss:0.24967, Time taken: 0:00:03.465019 ETA: 0:02:42.855904
2022-04-04 22:16:45,059 [INFO] __main__: Epoch: 3/50:, Cur-Step: 50, loss(cross_entropy): 0.27909, Running average loss:0.24967, Time taken: 0:00:03.465019 ETA: 0:02:42.855904
INFO:tensorflow:Saving checkpoints for step-56.
2022-04-04 22:16:45,508 [INFO] tensorflow: Saving checkpoints for step-56.
Epoch: 4/50:, Cur-Step: 60, loss(cross_entropy): 0.11357, Running average loss:0.18627, Time taken: 0:00:03.455615 ETA: 0:02:38.958281
2022-04-04 22:16:48,185 [INFO] __main__: Epoch: 4/50:, Cur-Step: 60, loss(cross_entropy): 0.11357, Running average loss:0.18627, Time taken: 0:00:03.455615 ETA: 0:02:38.958281
INFO:tensorflow:Saving checkpoints for step-70.
2022-04-04 22:16:48,877 [INFO] tensorflow: Saving checkpoints for step-70.
Epoch: 5/50:, Cur-Step: 70, loss(cross_entropy): 0.15270, Running average loss:0.15270, Time taken: 0:00:03.439477 ETA: 0:02:34.776474
2022-04-04 22:16:51,290 [INFO] __main__: Epoch: 5/50:, Cur-Step: 70, loss(cross_entropy): 0.15270, Running average loss:0.15270, Time taken: 0:00:03.439477 ETA: 0:02:34.776474
Epoch: 5/50:, Cur-Step: 80, loss(cross_entropy): 0.09373, Running average loss:0.12793, Time taken: 0:00:03.439477 ETA: 0:02:34.776474
2022-04-04 22:16:52,084 [INFO] __main__: Epoch: 5/50:, Cur-Step: 80, loss(cross_entropy): 0.09373, Running average loss:0.12793, Time taken: 0:00:03.439477 ETA: 0:02:34.776474
INFO:tensorflow:Saving checkpoints for step-84.
2022-04-04 22:16:52,296 [INFO] tensorflow: Saving checkpoints for step-84.
Epoch: 6/50:, Cur-Step: 90, loss(cross_entropy): 0.11153, Running average loss:0.13855, Time taken: 0:00:03.488393 ETA: 0:02:33.489274
2022-04-04 22:16:55,255 [INFO] __main__: Epoch: 6/50:, Cur-Step: 90, loss(cross_entropy): 0.11153, Running average loss:0.13855, Time taken: 0:00:03.488393 ETA: 0:02:33.489274
INFO:tensorflow:Saving checkpoints for step-98.
2022-04-04 22:16:55,837 [INFO] tensorflow: Saving checkpoints for step-98.
Epoch: 7/50:, Cur-Step: 100, loss(cross_entropy): 0.25211, Running average loss:0.19415, Time taken: 0:00:03.610301 ETA: 0:02:35.242944
2022-04-04 22:16:58,345 [INFO] __main__: Epoch: 7/50:, Cur-Step: 100, loss(cross_entropy): 0.25211, Running average loss:0.19415, Time taken: 0:00:03.610301 ETA: 0:02:35.242944
Epoch: 7/50:, Cur-Step: 110, loss(cross_entropy): 0.11339, Running average loss:0.15657, Time taken: 0:00:03.610301 ETA: 0:02:35.242944
2022-04-04 22:16:59,146 [INFO] __main__: Epoch: 7/50:, Cur-Step: 110, loss(cross_entropy): 0.11339, Running average loss:0.15657, Time taken: 0:00:03.610301 ETA: 0:02:35.242944
INFO:tensorflow:Saving checkpoints for step-112.
2022-04-04 22:16:59,218 [INFO] tensorflow: Saving checkpoints for step-112.
Epoch: 8/50:, Cur-Step: 120, loss(cross_entropy): 0.15073, Running average loss:0.14522, Time taken: 0:00:03.450322 ETA: 0:02:24.913530
2022-04-04 22:17:02,253 [INFO] __main__: Epoch: 8/50:, Cur-Step: 120, loss(cross_entropy): 0.15073, Running average loss:0.14522, Time taken: 0:00:03.450322 ETA: 0:02:24.913530
INFO:tensorflow:Saving checkpoints for step-126.
2022-04-04 22:17:02,609 [INFO] tensorflow: Saving checkpoints for step-126.
Epoch: 9/50:, Cur-Step: 130, loss(cross_entropy): 0.14156, Running average loss:0.13015, Time taken: 0:00:03.461765 ETA: 0:02:21.932367
2022-04-04 22:17:05,270 [INFO] __main__: Epoch: 9/50:, Cur-Step: 130, loss(cross_entropy): 0.14156, Running average loss:0.13015, Time taken: 0:00:03.461765 ETA: 0:02:21.932367
INFO:tensorflow:Saving checkpoints for step-140.
2022-04-04 22:17:05,972 [INFO] tensorflow: Saving checkpoints for step-140.
Epoch: 10/50:, Cur-Step: 140, loss(cross_entropy): 0.14323, Running average loss:0.14323, Time taken: 0:00:03.435190 ETA: 0:02:17.407598
2022-04-04 22:17:08,331 [INFO] __main__: Epoch: 10/50:, Cur-Step: 140, loss(cross_entropy): 0.14323, Running average loss:0.14323, Time taken: 0:00:03.435190 ETA: 0:02:17.407598
Epoch: 10/50:, Cur-Step: 150, loss(cross_entropy): 0.10545, Running average loss:0.12916, Time taken: 0:00:03.435190 ETA: 0:02:17.407598
2022-04-04 22:17:09,116 [INFO] __main__: Epoch: 10/50:, Cur-Step: 150, loss(cross_entropy): 0.10545, Running average loss:0.12916, Time taken: 0:00:03.435190 ETA: 0:02:17.407598
INFO:tensorflow:Saving checkpoints for step-154.
2022-04-04 22:17:09,331 [INFO] tensorflow: Saving checkpoints for step-154.
Epoch: 11/50:, Cur-Step: 160, loss(cross_entropy): 0.19405, Running average loss:0.17304, Time taken: 0:00:03.428074 ETA: 0:02:13.694900
2022-04-04 22:17:12,384 [INFO] __main__: Epoch: 11/50:, Cur-Step: 160, loss(cross_entropy): 0.19405, Running average loss:0.17304, Time taken: 0:00:03.428074 ETA: 0:02:13.694900
INFO:tensorflow:Saving checkpoints for step-168.
2022-04-04 22:17:12,879 [INFO] tensorflow: Saving checkpoints for step-168.
Epoch: 12/50:, Cur-Step: 170, loss(cross_entropy): 0.12577, Running average loss:0.13361, Time taken: 0:00:03.617774 ETA: 0:02:17.475430
2022-04-04 22:17:15,392 [INFO] __main__: Epoch: 12/50:, Cur-Step: 170, loss(cross_entropy): 0.12577, Running average loss:0.13361, Time taken: 0:00:03.617774 ETA: 0:02:17.475430
Epoch: 12/50:, Cur-Step: 180, loss(cross_entropy): 0.09080, Running average loss:0.14973, Time taken: 0:00:03.617774 ETA: 0:02:17.475430
2022-04-04 22:17:16,191 [INFO] __main__: Epoch: 12/50:, Cur-Step: 180, loss(cross_entropy): 0.09080, Running average loss:0.14973, Time taken: 0:00:03.617774 ETA: 0:02:17.475430
INFO:tensorflow:Saving checkpoints for step-182.
2022-04-04 22:17:16,263 [INFO] tensorflow: Saving checkpoints for step-182.
Epoch: 13/50:, Cur-Step: 190, loss(cross_entropy): 0.15514, Running average loss:0.15983, Time taken: 0:00:03.452905 ETA: 0:02:07.757474
2022-04-04 22:17:19,273 [INFO] __main__: Epoch: 13/50:, Cur-Step: 190, loss(cross_entropy): 0.15514, Running average loss:0.15983, Time taken: 0:00:03.452905 ETA: 0:02:07.757474
INFO:tensorflow:Saving checkpoints for step-196.
2022-04-04 22:17:19,627 [INFO] tensorflow: Saving checkpoints for step-196.
Epoch: 14/50:, Cur-Step: 200, loss(cross_entropy): 0.11538, Running average loss:0.14001, Time taken: 0:00:03.434702 ETA: 0:02:03.649261
2022-04-04 22:17:22,365 [INFO] __main__: Epoch: 14/50:, Cur-Step: 200, loss(cross_entropy): 0.11538, Running average loss:0.14001, Time taken: 0:00:03.434702 ETA: 0:02:03.649261
INFO:tensorflow:Saving checkpoints for step-210.
2022-04-04 22:17:22,997 [INFO] tensorflow: Saving checkpoints for step-210.
Epoch: 15/50:, Cur-Step: 210, loss(cross_entropy): 0.12739, Running average loss:0.12739, Time taken: 0:00:03.439476 ETA: 0:02:00.381652
2022-04-04 22:17:25,379 [INFO] __main__: Epoch: 15/50:, Cur-Step: 210, loss(cross_entropy): 0.12739, Running average loss:0.12739, Time taken: 0:00:03.439476 ETA: 0:02:00.381652
Epoch: 15/50:, Cur-Step: 220, loss(cross_entropy): 0.13134, Running average loss:0.11650, Time taken: 0:00:03.439476 ETA: 0:02:00.381652
2022-04-04 22:17:26,170 [INFO] __main__: Epoch: 15/50:, Cur-Step: 220, loss(cross_entropy): 0.13134, Running average loss:0.11650, Time taken: 0:00:03.439476 ETA: 0:02:00.381652
INFO:tensorflow:Saving checkpoints for step-224.
2022-04-04 22:17:26,384 [INFO] tensorflow: Saving checkpoints for step-224.
Epoch: 16/50:, Cur-Step: 230, loss(cross_entropy): 0.10573, Running average loss:0.13746, Time taken: 0:00:03.457021 ETA: 0:01:57.538714
2022-04-04 22:17:29,417 [INFO] __main__: Epoch: 16/50:, Cur-Step: 230, loss(cross_entropy): 0.10573, Running average loss:0.13746, Time taken: 0:00:03.457021 ETA: 0:01:57.538714
INFO:tensorflow:Saving checkpoints for step-238.
2022-04-04 22:17:29,914 [INFO] tensorflow: Saving checkpoints for step-238.
Epoch: 17/50:, Cur-Step: 240, loss(cross_entropy): 0.13378, Running average loss:0.11410, Time taken: 0:00:03.599775 ETA: 0:01:58.792570
2022-04-04 22:17:32,535 [INFO] __main__: Epoch: 17/50:, Cur-Step: 240, loss(cross_entropy): 0.13378, Running average loss:0.11410, Time taken: 0:00:03.599775 ETA: 0:01:58.792570
Epoch: 17/50:, Cur-Step: 250, loss(cross_entropy): 0.17311, Running average loss:0.14145, Time taken: 0:00:03.599775 ETA: 0:01:58.792570
2022-04-04 22:17:33,242 [INFO] __main__: Epoch: 17/50:, Cur-Step: 250, loss(cross_entropy): 0.17311, Running average loss:0.14145, Time taken: 0:00:03.599775 ETA: 0:01:58.792570
INFO:tensorflow:Saving checkpoints for step-252.
2022-04-04 22:17:33,370 [INFO] tensorflow: Saving checkpoints for step-252.
Epoch: 18/50:, Cur-Step: 260, loss(cross_entropy): 0.13819, Running average loss:0.13922, Time taken: 0:00:03.519983 ETA: 0:01:52.639442
2022-04-04 22:17:36,331 [INFO] __main__: Epoch: 18/50:, Cur-Step: 260, loss(cross_entropy): 0.13819, Running average loss:0.13922, Time taken: 0:00:03.519983 ETA: 0:01:52.639442
INFO:tensorflow:Saving checkpoints for step-266.
2022-04-04 22:17:36,769 [INFO] tensorflow: Saving checkpoints for step-266.
Epoch: 19/50:, Cur-Step: 270, loss(cross_entropy): 0.12684, Running average loss:0.13419, Time taken: 0:00:03.525974 ETA: 0:01:49.305188
2022-04-04 22:17:39,420 [INFO] __main__: Epoch: 19/50:, Cur-Step: 270, loss(cross_entropy): 0.12684, Running average loss:0.13419, Time taken: 0:00:03.525974 ETA: 0:01:49.305188
INFO:tensorflow:Saving checkpoints for step-280.
2022-04-04 22:17:40,118 [INFO] tensorflow: Saving checkpoints for step-280.
Epoch: 20/50:, Cur-Step: 280, loss(cross_entropy): 0.24065, Running average loss:0.24065, Time taken: 0:00:03.455585 ETA: 0:01:43.667564
2022-04-04 22:17:42,562 [INFO] __main__: Epoch: 20/50:, Cur-Step: 280, loss(cross_entropy): 0.24065, Running average loss:0.24065, Time taken: 0:00:03.455585 ETA: 0:01:43.667564
Epoch: 20/50:, Cur-Step: 290, loss(cross_entropy): 0.12485, Running average loss:0.13809, Time taken: 0:00:03.455585 ETA: 0:01:43.667564
2022-04-04 22:17:43,269 [INFO] __main__: Epoch: 20/50:, Cur-Step: 290, loss(cross_entropy): 0.12485, Running average loss:0.13809, Time taken: 0:00:03.455585 ETA: 0:01:43.667564
INFO:tensorflow:Saving checkpoints for step-294.
2022-04-04 22:17:43,549 [INFO] tensorflow: Saving checkpoints for step-294.
Epoch: 21/50:, Cur-Step: 300, loss(cross_entropy): 0.18123, Running average loss:0.14208, Time taken: 0:00:03.554652 ETA: 0:01:43.084914
2022-04-04 22:17:46,527 [INFO] __main__: Epoch: 21/50:, Cur-Step: 300, loss(cross_entropy): 0.18123, Running average loss:0.14208, Time taken: 0:00:03.554652 ETA: 0:01:43.084914
INFO:tensorflow:Saving checkpoints for step-308.
2022-04-04 22:17:47,123 [INFO] tensorflow: Saving checkpoints for step-308.
Epoch: 22/50:, Cur-Step: 310, loss(cross_entropy): 0.11410, Running average loss:0.13977, Time taken: 0:00:03.642405 ETA: 0:01:41.987328
2022-04-04 22:17:49,628 [INFO] __main__: Epoch: 22/50:, Cur-Step: 310, loss(cross_entropy): 0.11410, Running average loss:0.13977, Time taken: 0:00:03.642405 ETA: 0:01:41.987328
Epoch: 22/50:, Cur-Step: 320, loss(cross_entropy): 0.10623, Running average loss:0.12281, Time taken: 0:00:03.642405 ETA: 0:01:41.987328
2022-04-04 22:17:50,401 [INFO] __main__: Epoch: 22/50:, Cur-Step: 320, loss(cross_entropy): 0.10623, Running average loss:0.12281, Time taken: 0:00:03.642405 ETA: 0:01:41.987328
INFO:tensorflow:Saving checkpoints for step-322.
2022-04-04 22:17:50,472 [INFO] tensorflow: Saving checkpoints for step-322.
Epoch: 23/50:, Cur-Step: 330, loss(cross_entropy): 0.11926, Running average loss:0.15432, Time taken: 0:00:03.418862 ETA: 0:01:32.309270
2022-04-04 22:17:53,457 [INFO] __main__: Epoch: 23/50:, Cur-Step: 330, loss(cross_entropy): 0.11926, Running average loss:0.15432, Time taken: 0:00:03.418862 ETA: 0:01:32.309270
INFO:tensorflow:Saving checkpoints for step-336.
2022-04-04 22:17:53,887 [INFO] tensorflow: Saving checkpoints for step-336.
Epoch: 24/50:, Cur-Step: 340, loss(cross_entropy): 0.11528, Running average loss:0.13029, Time taken: 0:00:03.484240 ETA: 0:01:30.590235
2022-04-04 22:17:56,598 [INFO] __main__: Epoch: 24/50:, Cur-Step: 340, loss(cross_entropy): 0.11528, Running average loss:0.13029, Time taken: 0:00:03.484240 ETA: 0:01:30.590235
INFO:tensorflow:Saving checkpoints for step-350.
2022-04-04 22:17:57,326 [INFO] tensorflow: Saving checkpoints for step-350.
Epoch: 25/50:, Cur-Step: 350, loss(cross_entropy): 0.17276, Running average loss:0.17276, Time taken: 0:00:03.509134 ETA: 0:01:27.728339
2022-04-04 22:17:59,753 [INFO] __main__: Epoch: 25/50:, Cur-Step: 350, loss(cross_entropy): 0.17276, Running average loss:0.17276, Time taken: 0:00:03.509134 ETA: 0:01:27.728339
Epoch: 25/50:, Cur-Step: 360, loss(cross_entropy): 0.21940, Running average loss:0.14604, Time taken: 0:00:03.509134 ETA: 0:01:27.728339
2022-04-04 22:18:00,555 [INFO] __main__: Epoch: 25/50:, Cur-Step: 360, loss(cross_entropy): 0.21940, Running average loss:0.14604, Time taken: 0:00:03.509134 ETA: 0:01:27.728339
INFO:tensorflow:Saving checkpoints for step-364.
2022-04-04 22:18:00,770 [INFO] tensorflow: Saving checkpoints for step-364.
Epoch: 26/50:, Cur-Step: 370, loss(cross_entropy): 0.12968, Running average loss:0.14245, Time taken: 0:00:03.514439 ETA: 0:01:24.346544
2022-04-04 22:18:03,755 [INFO] __main__: Epoch: 26/50:, Cur-Step: 370, loss(cross_entropy): 0.12968, Running average loss:0.14245, Time taken: 0:00:03.514439 ETA: 0:01:24.346544
INFO:tensorflow:Saving checkpoints for step-378.
2022-04-04 22:18:04,321 [INFO] tensorflow: Saving checkpoints for step-378.
Epoch: 27/50:, Cur-Step: 380, loss(cross_entropy): 0.09991, Running average loss:0.12556, Time taken: 0:00:03.620800 ETA: 0:01:23.278395
2022-04-04 22:18:06,839 [INFO] __main__: Epoch: 27/50:, Cur-Step: 380, loss(cross_entropy): 0.09991, Running average loss:0.12556, Time taken: 0:00:03.620800 ETA: 0:01:23.278395
Epoch: 27/50:, Cur-Step: 390, loss(cross_entropy): 0.12082, Running average loss:0.14010, Time taken: 0:00:03.620800 ETA: 0:01:23.278395
2022-04-04 22:18:07,642 [INFO] __main__: Epoch: 27/50:, Cur-Step: 390, loss(cross_entropy): 0.12082, Running average loss:0.14010, Time taken: 0:00:03.620800 ETA: 0:01:23.278395
INFO:tensorflow:Saving checkpoints for step-392.
2022-04-04 22:18:07,715 [INFO] tensorflow: Saving checkpoints for step-392.
Epoch: 28/50:, Cur-Step: 400, loss(cross_entropy): 0.12549, Running average loss:0.14316, Time taken: 0:00:03.462762 ETA: 0:01:16.180772
2022-04-04 22:18:10,759 [INFO] __main__: Epoch: 28/50:, Cur-Step: 400, loss(cross_entropy): 0.12549, Running average loss:0.14316, Time taken: 0:00:03.462762 ETA: 0:01:16.180772
INFO:tensorflow:Saving checkpoints for step-406.
2022-04-04 22:18:11,113 [INFO] tensorflow: Saving checkpoints for step-406.
Epoch: 29/50:, Cur-Step: 410, loss(cross_entropy): 0.13606, Running average loss:0.14395, Time taken: 0:00:03.468647 ETA: 0:01:12.841597
2022-04-04 22:18:13,797 [INFO] __main__: Epoch: 29/50:, Cur-Step: 410, loss(cross_entropy): 0.13606, Running average loss:0.14395, Time taken: 0:00:03.468647 ETA: 0:01:12.841597
INFO:tensorflow:Saving checkpoints for step-420.
2022-04-04 22:18:14,489 [INFO] tensorflow: Saving checkpoints for step-420.
Epoch: 30/50:, Cur-Step: 420, loss(cross_entropy): 0.21838, Running average loss:0.21838, Time taken: 0:00:03.446214 ETA: 0:01:08.924284
2022-04-04 22:18:16,861 [INFO] __main__: Epoch: 30/50:, Cur-Step: 420, loss(cross_entropy): 0.21838, Running average loss:0.21838, Time taken: 0:00:03.446214 ETA: 0:01:08.924284
Epoch: 30/50:, Cur-Step: 430, loss(cross_entropy): 0.08654, Running average loss:0.12284, Time taken: 0:00:03.446214 ETA: 0:01:08.924284
2022-04-04 22:18:17,650 [INFO] __main__: Epoch: 30/50:, Cur-Step: 430, loss(cross_entropy): 0.08654, Running average loss:0.12284, Time taken: 0:00:03.446214 ETA: 0:01:08.924284
INFO:tensorflow:Saving checkpoints for step-434.
2022-04-04 22:18:17,866 [INFO] tensorflow: Saving checkpoints for step-434.
Epoch: 31/50:, Cur-Step: 440, loss(cross_entropy): 0.12405, Running average loss:0.14114, Time taken: 0:00:03.446021 ETA: 0:01:05.474405
2022-04-04 22:18:20,714 [INFO] __main__: Epoch: 31/50:, Cur-Step: 440, loss(cross_entropy): 0.12405, Running average loss:0.14114, Time taken: 0:00:03.446021 ETA: 0:01:05.474405
INFO:tensorflow:Saving checkpoints for step-448.
2022-04-04 22:18:21,207 [INFO] tensorflow: Saving checkpoints for step-448.
Epoch: 32/50:, Cur-Step: 450, loss(cross_entropy): 0.11512, Running average loss:0.11240, Time taken: 0:00:03.409501 ETA: 0:01:01.371019
2022-04-04 22:18:23,705 [INFO] __main__: Epoch: 32/50:, Cur-Step: 450, loss(cross_entropy): 0.11512, Running average loss:0.11240, Time taken: 0:00:03.409501 ETA: 0:01:01.371019
Epoch: 32/50:, Cur-Step: 460, loss(cross_entropy): 0.08534, Running average loss:0.13475, Time taken: 0:00:03.409501 ETA: 0:01:01.371019
2022-04-04 22:18:24,488 [INFO] __main__: Epoch: 32/50:, Cur-Step: 460, loss(cross_entropy): 0.08534, Running average loss:0.13475, Time taken: 0:00:03.409501 ETA: 0:01:01.371019
INFO:tensorflow:Saving checkpoints for step-462.
2022-04-04 22:18:24,561 [INFO] tensorflow: Saving checkpoints for step-462.
Epoch: 33/50:, Cur-Step: 470, loss(cross_entropy): 0.09924, Running average loss:0.11465, Time taken: 0:00:03.423143 ETA: 0:00:58.193438
2022-04-04 22:18:27,577 [INFO] __main__: Epoch: 33/50:, Cur-Step: 470, loss(cross_entropy): 0.09924, Running average loss:0.11465, Time taken: 0:00:03.423143 ETA: 0:00:58.193438
INFO:tensorflow:Saving checkpoints for step-476.
2022-04-04 22:18:27,933 [INFO] tensorflow: Saving checkpoints for step-476.
Epoch: 34/50:, Cur-Step: 480, loss(cross_entropy): 0.11594, Running average loss:0.16393, Time taken: 0:00:03.443151 ETA: 0:00:55.090408
2022-04-04 22:18:30,766 [INFO] __main__: Epoch: 34/50:, Cur-Step: 480, loss(cross_entropy): 0.11594, Running average loss:0.16393, Time taken: 0:00:03.443151 ETA: 0:00:55.090408
INFO:tensorflow:Saving checkpoints for step-490.
2022-04-04 22:18:31,429 [INFO] tensorflow: Saving checkpoints for step-490.
Epoch: 35/50:, Cur-Step: 490, loss(cross_entropy): 0.10231, Running average loss:0.10231, Time taken: 0:00:03.566047 ETA: 0:00:53.490704
2022-04-04 22:18:34,028 [INFO] __main__: Epoch: 35/50:, Cur-Step: 490, loss(cross_entropy): 0.10231, Running average loss:0.10231, Time taken: 0:00:03.566047 ETA: 0:00:53.490704
Epoch: 35/50:, Cur-Step: 500, loss(cross_entropy): 0.09222, Running average loss:0.12526, Time taken: 0:00:03.566047 ETA: 0:00:53.490704
2022-04-04 22:18:34,840 [INFO] __main__: Epoch: 35/50:, Cur-Step: 500, loss(cross_entropy): 0.09222, Running average loss:0.12526, Time taken: 0:00:03.566047 ETA: 0:00:53.490704
INFO:tensorflow:Saving checkpoints for step-504.
2022-04-04 22:18:35,072 [INFO] tensorflow: Saving checkpoints for step-504.
Epoch: 36/50:, Cur-Step: 510, loss(cross_entropy): 0.07719, Running average loss:0.13793, Time taken: 0:00:03.711544 ETA: 0:00:51.961613
2022-04-04 22:18:37,951 [INFO] __main__: Epoch: 36/50:, Cur-Step: 510, loss(cross_entropy): 0.07719, Running average loss:0.13793, Time taken: 0:00:03.711544 ETA: 0:00:51.961613
INFO:tensorflow:Saving checkpoints for step-518.
2022-04-04 22:18:38,442 [INFO] tensorflow: Saving checkpoints for step-518.
Epoch: 37/50:, Cur-Step: 520, loss(cross_entropy): 0.14381, Running average loss:0.14802, Time taken: 0:00:03.444330 ETA: 0:00:44.776296
2022-04-04 22:18:41,049 [INFO] __main__: Epoch: 37/50:, Cur-Step: 520, loss(cross_entropy): 0.14381, Running average loss:0.14802, Time taken: 0:00:03.444330 ETA: 0:00:44.776296
Epoch: 37/50:, Cur-Step: 530, loss(cross_entropy): 0.12108, Running average loss:0.13837, Time taken: 0:00:03.444330 ETA: 0:00:44.776296
2022-04-04 22:18:41,747 [INFO] __main__: Epoch: 37/50:, Cur-Step: 530, loss(cross_entropy): 0.12108, Running average loss:0.13837, Time taken: 0:00:03.444330 ETA: 0:00:44.776296
INFO:tensorflow:Saving checkpoints for step-532.
2022-04-04 22:18:41,860 [INFO] tensorflow: Saving checkpoints for step-532.
Epoch: 38/50:, Cur-Step: 540, loss(cross_entropy): 0.13250, Running average loss:0.12890, Time taken: 0:00:03.478696 ETA: 0:00:41.744348
2022-04-04 22:18:44,811 [INFO] __main__: Epoch: 38/50:, Cur-Step: 540, loss(cross_entropy): 0.13250, Running average loss:0.12890, Time taken: 0:00:03.478696 ETA: 0:00:41.744348
INFO:tensorflow:Saving checkpoints for step-546.
2022-04-04 22:18:45,238 [INFO] tensorflow: Saving checkpoints for step-546.
Epoch: 39/50:, Cur-Step: 550, loss(cross_entropy): 0.16321, Running average loss:0.12217, Time taken: 0:00:03.489474 ETA: 0:00:38.384209
2022-04-04 22:18:47,886 [INFO] __main__: Epoch: 39/50:, Cur-Step: 550, loss(cross_entropy): 0.16321, Running average loss:0.12217, Time taken: 0:00:03.489474 ETA: 0:00:38.384209
INFO:tensorflow:Saving checkpoints for step-560.
2022-04-04 22:18:48,585 [INFO] tensorflow: Saving checkpoints for step-560.
Epoch: 40/50:, Cur-Step: 560, loss(cross_entropy): 0.07088, Running average loss:0.07088, Time taken: 0:00:03.454730 ETA: 0:00:34.547300 2022-04-04 22:18:51,080 [INFO] __main__: Epoch: 40/50:, Cur-Step: 560, loss(cross_entropy): 0.07088, Running average loss:0.07088, Time taken: 0:00:03.454730 ETA: 0:00:34.547300 Epoch: 40/50:, Cur-Step: 570, loss(cross_entropy): 0.10161, Running average loss:0.13799, Time taken: 0:00:03.454730 ETA: 0:00:34.547300 2022-04-04 22:18:51,785 [INFO] __main__: Epoch: 40/50:, Cur-Step: 570, loss(cross_entropy): 0.10161, Running average loss:0.13799, Time taken: 0:00:03.454730 ETA: 0:00:34.547300 INFO:tensorflow:Saving checkpoints for step-574. 2022-04-04 22:18:52,055 [INFO] tensorflow: Saving checkpoints for step-574. Epoch: 41/50:, Cur-Step: 580, loss(cross_entropy): 0.11347, Running average loss:0.13529, Time taken: 0:00:03.569641 ETA: 0:00:32.126770 2022-04-04 22:18:54,840 [INFO] __main__: Epoch: 41/50:, Cur-Step: 580, loss(cross_entropy): 0.11347, Running average loss:0.13529, Time taken: 0:00:03.569641 ETA: 0:00:32.126770 INFO:tensorflow:Saving checkpoints for step-588. 2022-04-04 22:18:55,413 [INFO] tensorflow: Saving checkpoints for step-588. Epoch: 42/50:, Cur-Step: 590, loss(cross_entropy): 0.10178, Running average loss:0.10766, Time taken: 0:00:03.426452 ETA: 0:00:27.411619 2022-04-04 22:18:57,870 [INFO] __main__: Epoch: 42/50:, Cur-Step: 590, loss(cross_entropy): 0.10178, Running average loss:0.10766, Time taken: 0:00:03.426452 ETA: 0:00:27.411619 Epoch: 42/50:, Cur-Step: 600, loss(cross_entropy): 0.12954, Running average loss:0.13044, Time taken: 0:00:03.426452 ETA: 0:00:27.411619 2022-04-04 22:18:58,659 [INFO] __main__: Epoch: 42/50:, Cur-Step: 600, loss(cross_entropy): 0.12954, Running average loss:0.13044, Time taken: 0:00:03.426452 ETA: 0:00:27.411619 INFO:tensorflow:Saving checkpoints for step-602. 2022-04-04 22:18:58,731 [INFO] tensorflow: Saving checkpoints for step-602. 
Epoch: 43/50:, Cur-Step: 610, loss(cross_entropy): 0.13411, Running average loss:0.12082, Time taken: 0:00:03.387674 ETA: 0:00:23.713719 2022-04-04 22:19:01,666 [INFO] __main__: Epoch: 43/50:, Cur-Step: 610, loss(cross_entropy): 0.13411, Running average loss:0.12082, Time taken: 0:00:03.387674 ETA: 0:00:23.713719 INFO:tensorflow:Saving checkpoints for step-616. 2022-04-04 22:19:02,096 [INFO] tensorflow: Saving checkpoints for step-616. Epoch: 44/50:, Cur-Step: 620, loss(cross_entropy): 0.10152, Running average loss:0.11897, Time taken: 0:00:03.435690 ETA: 0:00:20.614142 2022-04-04 22:19:04,740 [INFO] __main__: Epoch: 44/50:, Cur-Step: 620, loss(cross_entropy): 0.10152, Running average loss:0.11897, Time taken: 0:00:03.435690 ETA: 0:00:20.614142 INFO:tensorflow:Saving checkpoints for step-630. 2022-04-04 22:19:05,448 [INFO] tensorflow: Saving checkpoints for step-630. Epoch: 45/50:, Cur-Step: 630, loss(cross_entropy): 0.09567, Running average loss:0.09567, Time taken: 0:00:03.422405 ETA: 0:00:17.112027 2022-04-04 22:19:07,795 [INFO] __main__: Epoch: 45/50:, Cur-Step: 630, loss(cross_entropy): 0.09567, Running average loss:0.09567, Time taken: 0:00:03.422405 ETA: 0:00:17.112027 Epoch: 45/50:, Cur-Step: 640, loss(cross_entropy): 0.16028, Running average loss:0.13288, Time taken: 0:00:03.422405 ETA: 0:00:17.112027 2022-04-04 22:19:08,580 [INFO] __main__: Epoch: 45/50:, Cur-Step: 640, loss(cross_entropy): 0.16028, Running average loss:0.13288, Time taken: 0:00:03.422405 ETA: 0:00:17.112027 INFO:tensorflow:Saving checkpoints for step-644. 2022-04-04 22:19:08,792 [INFO] tensorflow: Saving checkpoints for step-644. Epoch: 46/50:, Cur-Step: 650, loss(cross_entropy): 0.09016, Running average loss:0.14360, Time taken: 0:00:03.412733 ETA: 0:00:13.650931 2022-04-04 22:19:11,567 [INFO] __main__: Epoch: 46/50:, Cur-Step: 650, loss(cross_entropy): 0.09016, Running average loss:0.14360, Time taken: 0:00:03.412733 ETA: 0:00:13.650931 INFO:tensorflow:Saving checkpoints for step-658. 
2022-04-04 22:19:12,144 [INFO] tensorflow: Saving checkpoints for step-658. Epoch: 47/50:, Cur-Step: 660, loss(cross_entropy): 0.15755, Running average loss:0.13269, Time taken: 0:00:03.422488 ETA: 0:00:10.267464 2022-04-04 22:19:14,673 [INFO] __main__: Epoch: 47/50:, Cur-Step: 660, loss(cross_entropy): 0.15755, Running average loss:0.13269, Time taken: 0:00:03.422488 ETA: 0:00:10.267464 Epoch: 47/50:, Cur-Step: 670, loss(cross_entropy): 0.07689, Running average loss:0.11388, Time taken: 0:00:03.422488 ETA: 0:00:10.267464 2022-04-04 22:19:15,459 [INFO] __main__: Epoch: 47/50:, Cur-Step: 670, loss(cross_entropy): 0.07689, Running average loss:0.11388, Time taken: 0:00:03.422488 ETA: 0:00:10.267464 INFO:tensorflow:Saving checkpoints for step-672. 2022-04-04 22:19:15,532 [INFO] tensorflow: Saving checkpoints for step-672. Epoch: 48/50:, Cur-Step: 680, loss(cross_entropy): 0.12327, Running average loss:0.13725, Time taken: 0:00:03.456622 ETA: 0:00:06.913244 2022-04-04 22:19:18,541 [INFO] __main__: Epoch: 48/50:, Cur-Step: 680, loss(cross_entropy): 0.12327, Running average loss:0.13725, Time taken: 0:00:03.456622 ETA: 0:00:06.913244 INFO:tensorflow:Saving checkpoints for step-686. 2022-04-04 22:19:18,892 [INFO] tensorflow: Saving checkpoints for step-686. Epoch: 49/50:, Cur-Step: 690, loss(cross_entropy): 0.15014, Running average loss:0.14538, Time taken: 0:00:03.431798 ETA: 0:00:03.431798 2022-04-04 22:19:21,535 [INFO] __main__: Epoch: 49/50:, Cur-Step: 690, loss(cross_entropy): 0.15014, Running average loss:0.14538, Time taken: 0:00:03.431798 ETA: 0:00:03.431798 INFO:tensorflow:Saving checkpoints for step-700. 2022-04-04 22:19:22,240 [INFO] tensorflow: Saving checkpoints for step-700. 
Throughput Avg: 40.229 img/s
Latency Avg: 84.565 ms
Latency 90%: 95.361 ms
Latency 95%: 97.429 ms
Latency 99%: 101.471 ms
DLL 2022-04-04 22:19:24.484207 - () throughput_train:40.22860874047034 latency_train:84.56539050362824 elapsed_time:192.86995
INFO:tensorflow:Loss for final step: 0.18393885.
2022-04-04 22:19:24,570 [INFO] tensorflow: Loss for final step: 0.18393885.
Saving the final step model to /workspace/tao-experiments//unpruned/weights/model.tlt
2022-04-04 22:19:24,571 [INFO] __main__: Saving the final step model to /workspace/tao-experiments//unpruned/weights/model.tlt
2022-04-05 01:19:26,910 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
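
The training log above prints every progress message twice (once bare and once with a logger prefix), which makes the loss trend hard to read by eye. The following is a minimal sketch, not part of the TAO Toolkit, of how the per-step loss values could be pulled out of a saved copy of this output. The regular expression is written against the message format seen in this log only and may need adjusting for other versions; the names `PROGRESS`, `parse_progress`, and `sample` are hypothetical helpers introduced for illustration.

```python
import re

# Pattern for one progress message, based on the format seen in the log above
# (assumption: other TAO versions may format these messages differently).
PROGRESS = re.compile(
    r"Epoch: (?P<epoch>\d+)/(?P<total>\d+):, Cur-Step: (?P<step>\d+), "
    r"loss\(cross_entropy\): (?P<loss>[\d.]+), "
    r"Running average loss:(?P<avg>[\d.]+)"
)


def parse_progress(log_text):
    """Return one record per unique (epoch, step), in order of first appearance."""
    records = {}
    for match in PROGRESS.finditer(log_text):
        key = (int(match["epoch"]), int(match["step"]))
        # Each message is echoed twice in the log; keep only the first copy.
        records.setdefault(key, {
            "epoch": key[0],
            "step": key[1],
            "loss": float(match["loss"]),
            "running_avg": float(match["avg"]),
        })
    return list(records.values())


# Two messages copied from the log above, each with its logger-prefixed echo.
sample = (
    "Epoch: 35/50:, Cur-Step: 490, loss(cross_entropy): 0.10231, "
    "Running average loss:0.10231, Time taken: 0:00:03.566047 ETA: 0:00:53.490704 "
    "2022-04-04 22:18:34,028 [INFO] __main__: Epoch: 35/50:, Cur-Step: 490, "
    "loss(cross_entropy): 0.10231, Running average loss:0.10231, "
    "Time taken: 0:00:03.566047 ETA: 0:00:53.490704 "
    "Epoch: 35/50:, Cur-Step: 500, loss(cross_entropy): 0.09222, "
    "Running average loss:0.12526, Time taken: 0:00:03.566047 ETA: 0:00:53.490704 "
    "2022-04-04 22:18:34,840 [INFO] __main__: Epoch: 35/50:, Cur-Step: 500, "
    "loss(cross_entropy): 0.09222, Running average loss:0.12526, "
    "Time taken: 0:00:03.566047 ETA: 0:00:53.490704"
)

rows = parse_progress(sample)
for row in rows:
    print(row["epoch"], row["step"], row["loss"], row["running_avg"])
```

Because the pattern is applied with `finditer` over the whole text, it works whether the log is stored one message per line or as a single run of text. To use it on a full run, read the saved output into a string and pass it to `parse_progress`.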