• Hardware: RTX A4000
• Network Type: DetectNet_v2
• TAO Version: 5.0.0
I’m following the detectnet_v2 Jupyter notebook and trying to train on a custom dataset. I have 74 PNG images, with labels converted to the KITTI format.
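Since the labels were converted by hand, a quick sanity check on them may help rule out a data problem. The following is only a minimal sketch, assuming the standard 15-field KITTI label layout (class name, truncation, occlusion, alpha, four box coordinates, three dimensions, three location values, rotation). The function names `check_kitti_line` and `check_label_dir` are hypothetical helpers and not part of TAO.

```python
from pathlib import Path

# Standard KITTI label layout: class name followed by 14 numeric fields.
KITTI_FIELD_COUNT = 15


def check_kitti_line(line):
    """Return a list of problems found in one KITTI label line (empty if none)."""
    fields = line.split()
    if len(fields) != KITTI_FIELD_COUNT:
        return [f"expected {KITTI_FIELD_COUNT} fields, found {len(fields)}"]
    try:
        values = [float(v) for v in fields[1:]]
    except ValueError:
        return ["non-numeric value after the class name"]

    problems = []
    # Fields 4-7 of the line (0-based) hold the box: xmin, ymin, xmax, ymax in pixels.
    xmin, ymin, xmax, ymax = values[3:7]
    if xmax <= xmin or ymax <= ymin:
        problems.append("bounding box has zero or negative size")
    if min(xmin, ymin) < 0:
        problems.append("bounding box has a negative coordinate")
    return problems


def check_label_dir(label_dir):
    """Map 'file:line' to its problems for every .txt label file in label_dir."""
    report = {}
    for path in sorted(Path(label_dir).glob("*.txt")):
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            if not line.strip():
                continue  # skip blank lines
            problems = check_kitti_line(line)
            if problems:
                report[f"{path.name}:{number}"] = problems
    return report
```

Calling `check_label_dir` on the label folder (the path depends on the local setup) returns an empty dict when every line passes these basic checks. It does not verify that boxes fit inside the image or that class names match the training spec.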
This is the output containing the error when I try to run the training step:
2023-08-29 16:08:52,026 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-08-29 16:08:52,067 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2023-08-29 16:08:52,092 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
2023-08-29 20:08:52.860937: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-08-29 20:08:52,891 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2023-08-29 20:08:53,971 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 20:08:53,995 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 20:08:53,997 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 20:08:55,174 [TAO Toolkit] [WARNING] matplotlib 500: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-0tu44y1q because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2023-08-29 20:08:55,333 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 20:08:56,734 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 20:08:56,756 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 20:08:56,758 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 20:08:57,825 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.logging.logging 197: Log file already exists at /workspace/tao-experiments/detectnet_v2/experiment_dir_unpruned/status.json
2023-08-29 20:08:57,826 [TAO Toolkit] [INFO] root 2102: Starting DetectNet_v2 Training job
2023-08-29 20:08:57,827 [TAO Toolkit] [INFO] __main__ 817: Loading experiment spec at /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt.
2023-08-29 20:08:57,828 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.spec_handler.spec_loader 113: Merging specification from /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_train_resnet18_kitti.txt
2023-08-29 20:08:57,830 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.mlops.wandb 69: Initializing wandb.
2023-08-29 20:08:57,830 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.common.mlops.wandb 97: Wandb logging failed with error WandB client wasn't logged in. Please make sure to set the WANDB_API_KEY env variable or run `wandb login` in over the CLI and copy the ~/.netrc file to the container.
2023-08-29 20:08:57,831 [TAO Toolkit] [INFO] __main__ 857: Integrating with clearml.
2023-08-29 20:08:57,924 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.common.mlops.clearml 55: ClearML task init failed with error ClearML configuration could not be found (missing `~/clearml.conf` or Environment CLEARML_API_HOST)
To get started with ClearML: setup your own `clearml-server`, or create a free account at https://app.clear.ml
2023-08-29 20:08:57,924 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.common.mlops.clearml 58: Training will still continue.
2023-08-29 20:08:57,925 [TAO Toolkit] [INFO] root 2102: Training gridbox model.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2023-08-29 20:08:57,925 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2023-08-29 20:08:58,026 [TAO Toolkit] [INFO] root 522: Sampling mode of the dataloader was set to user_defined.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_auto_weight_hook.py:122: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
2023-08-29 20:08:58,026 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_auto_weight_hook.py:122: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_auto_weight_hook.py:125: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
2023-08-29 20:08:58,027 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_auto_weight_hook.py:125: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_auto_weight_hook.py:128: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.
2023-08-29 20:08:58,028 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_auto_weight_hook.py:128: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.
2023-08-29 20:08:58,032 [TAO Toolkit] [INFO] root 2102: Building DetectNet V2 model
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
2023-08-29 20:08:58,032 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
2023-08-29 20:08:58,033 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
2023-08-29 20:08:58,045 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/third_party/keras/tensorflow_backend.py:199: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.
2023-08-29 20:08:58,704 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/third_party/keras/tensorflow_backend.py:199: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
2023-08-29 20:08:58,900 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
2023-08-29 20:08:58,900 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
2023-08-29 20:08:58,901 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
2023-08-29 20:08:59,260 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
2023-08-29 20:08:59,670 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 133: Loading weights from pretrained model file. /workspace/tao-experiments/detectnet_v2/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5
2023-08-29 20:08:59,670 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer input_1 weights set from pre-trained model.
2023-08-29 20:08:59,771 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer conv1 weights set from pre-trained model.
2023-08-29 20:08:59,869 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer bn_conv1 weights set from pre-trained model.
2023-08-29 20:08:59,869 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer activation_1 weights set from pre-trained model.
2023-08-29 20:08:59,976 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_conv_1 weights set from pre-trained model.
2023-08-29 20:09:00,091 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_bn_1 weights set from pre-trained model.
2023-08-29 20:09:00,193 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_conv_2 weights set from pre-trained model.
2023-08-29 20:09:00,304 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_conv_shortcut weights set from pre-trained model.
2023-08-29 20:09:00,420 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_bn_2 weights set from pre-trained model.
2023-08-29 20:09:00,532 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1a_bn_shortcut weights set from pre-trained model.
2023-08-29 20:09:00,532 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_1 weights set from pre-trained model.
2023-08-29 20:09:00,632 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1b_conv_1 weights set from pre-trained model.
2023-08-29 20:09:00,756 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1b_bn_1 weights set from pre-trained model.
2023-08-29 20:09:00,877 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1b_conv_2 weights set from pre-trained model.
2023-08-29 20:09:01,006 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_1b_bn_2 weights set from pre-trained model.
2023-08-29 20:09:01,006 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_2 weights set from pre-trained model.
2023-08-29 20:09:01,131 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_conv_1 weights set from pre-trained model.
2023-08-29 20:09:01,238 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_bn_1 weights set from pre-trained model.
2023-08-29 20:09:01,344 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_conv_2 weights set from pre-trained model.
2023-08-29 20:09:01,460 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_conv_shortcut weights set from pre-trained model.
2023-08-29 20:09:01,571 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_bn_2 weights set from pre-trained model.
2023-08-29 20:09:01,681 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2a_bn_shortcut weights set from pre-trained model.
2023-08-29 20:09:01,681 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_3 weights set from pre-trained model.
2023-08-29 20:09:01,798 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2b_conv_1 weights set from pre-trained model.
2023-08-29 20:09:01,913 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2b_bn_1 weights set from pre-trained model.
2023-08-29 20:09:02,022 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2b_conv_2 weights set from pre-trained model.
2023-08-29 20:09:02,139 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_2b_bn_2 weights set from pre-trained model.
2023-08-29 20:09:02,139 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_4 weights set from pre-trained model.
2023-08-29 20:09:02,271 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_conv_1 weights set from pre-trained model.
2023-08-29 20:09:02,407 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_bn_1 weights set from pre-trained model.
2023-08-29 20:09:02,535 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_conv_2 weights set from pre-trained model.
2023-08-29 20:09:02,651 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_conv_shortcut weights set from pre-trained model.
2023-08-29 20:09:02,780 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_bn_2 weights set from pre-trained model.
2023-08-29 20:09:02,894 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3a_bn_shortcut weights set from pre-trained model.
2023-08-29 20:09:02,894 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_5 weights set from pre-trained model.
2023-08-29 20:09:03,001 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3b_conv_1 weights set from pre-trained model.
2023-08-29 20:09:03,135 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3b_bn_1 weights set from pre-trained model.
2023-08-29 20:09:03,253 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3b_conv_2 weights set from pre-trained model.
2023-08-29 20:09:03,368 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_3b_bn_2 weights set from pre-trained model.
2023-08-29 20:09:03,368 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_6 weights set from pre-trained model.
2023-08-29 20:09:03,493 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_conv_1 weights set from pre-trained model.
2023-08-29 20:09:03,616 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_bn_1 weights set from pre-trained model.
2023-08-29 20:09:03,725 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_conv_2 weights set from pre-trained model.
2023-08-29 20:09:03,842 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_conv_shortcut weights set from pre-trained model.
2023-08-29 20:09:03,967 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_bn_2 weights set from pre-trained model.
2023-08-29 20:09:04,087 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4a_bn_shortcut weights set from pre-trained model.
2023-08-29 20:09:04,088 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_7 weights set from pre-trained model.
2023-08-29 20:09:04,214 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4b_conv_1 weights set from pre-trained model.
2023-08-29 20:09:04,340 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4b_bn_1 weights set from pre-trained model.
2023-08-29 20:09:04,453 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4b_conv_2 weights set from pre-trained model.
2023-08-29 20:09:04,579 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer block_4b_bn_2 weights set from pre-trained model.
2023-08-29 20:09:04,579 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.model.detectnet_model 142: Layer add_8 weights set from pre-trained model.
2023-08-29 20:09:04,653 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.objectives.bbox_objective 78: Default L1 loss function will be used.
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 3, 256, 256) 0
__________________________________________________________________________________________________
conv1 (Conv2D) (None, 64, 128, 128) 9472 input_1[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization) (None, 64, 128, 128) 256 conv1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation) (None, 64, 128, 128) 0 bn_conv1[0][0]
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D) (None, 64, 64, 64) 36928 activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 64, 64) 256 block_1a_conv_1[0][0]
__________________________________________________________________________________________________
block_1a_relu_1 (Activation) (None, 64, 64, 64) 0 block_1a_bn_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D) (None, 64, 64, 64) 36928 block_1a_relu_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 64, 64) 4160 activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 64, 64) 256 block_1a_conv_2[0][0]
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 64, 64) 256 block_1a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_1 (Add) (None, 64, 64, 64) 0 block_1a_bn_2[0][0]
block_1a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_1a_relu (Activation) (None, 64, 64, 64) 0 add_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D) (None, 64, 64, 64) 36928 block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 64, 64) 256 block_1b_conv_1[0][0]
__________________________________________________________________________________________________
block_1b_relu_1 (Activation) (None, 64, 64, 64) 0 block_1b_bn_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D) (None, 64, 64, 64) 36928 block_1b_relu_1[0][0]
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 64, 64) 256 block_1b_conv_2[0][0]
__________________________________________________________________________________________________
add_2 (Add) (None, 64, 64, 64) 0 block_1b_bn_2[0][0]
block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_relu (Activation) (None, 64, 64, 64) 0 add_2[0][0]
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D) (None, 128, 32, 32) 73856 block_1b_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 32, 32) 512 block_2a_conv_1[0][0]
__________________________________________________________________________________________________
block_2a_relu_1 (Activation) (None, 128, 32, 32) 0 block_2a_bn_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D) (None, 128, 32, 32) 147584 block_2a_relu_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 32, 32) 8320 block_1b_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 32, 32) 512 block_2a_conv_2[0][0]
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 32, 32) 512 block_2a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_3 (Add) (None, 128, 32, 32) 0 block_2a_bn_2[0][0]
block_2a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_2a_relu (Activation) (None, 128, 32, 32) 0 add_3[0][0]
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D) (None, 128, 32, 32) 147584 block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 32, 32) 512 block_2b_conv_1[0][0]
__________________________________________________________________________________________________
block_2b_relu_1 (Activation) (None, 128, 32, 32) 0 block_2b_bn_1[0][0]
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D) (None, 128, 32, 32) 147584 block_2b_relu_1[0][0]
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 32, 32) 512 block_2b_conv_2[0][0]
__________________________________________________________________________________________________
add_4 (Add) (None, 128, 32, 32) 0 block_2b_bn_2[0][0]
block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_relu (Activation) (None, 128, 32, 32) 0 add_4[0][0]
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D) (None, 256, 16, 16) 295168 block_2b_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 16, 16) 1024 block_3a_conv_1[0][0]
__________________________________________________________________________________________________
block_3a_relu_1 (Activation) (None, 256, 16, 16) 0 block_3a_bn_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D) (None, 256, 16, 16) 590080 block_3a_relu_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 16, 16) 33024 block_2b_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 16, 16) 1024 block_3a_conv_2[0][0]
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 16, 16) 1024 block_3a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_5 (Add) (None, 256, 16, 16) 0 block_3a_bn_2[0][0]
block_3a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_3a_relu (Activation) (None, 256, 16, 16) 0 add_5[0][0]
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D) (None, 256, 16, 16) 590080 block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 16, 16) 1024 block_3b_conv_1[0][0]
__________________________________________________________________________________________________
block_3b_relu_1 (Activation) (None, 256, 16, 16) 0 block_3b_bn_1[0][0]
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D) (None, 256, 16, 16) 590080 block_3b_relu_1[0][0]
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 16, 16) 1024 block_3b_conv_2[0][0]
__________________________________________________________________________________________________
add_6 (Add) (None, 256, 16, 16) 0 block_3b_bn_2[0][0]
block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_relu (Activation) (None, 256, 16, 16) 0 add_6[0][0]
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D) (None, 512, 16, 16) 1180160 block_3b_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 16, 16) 2048 block_4a_conv_1[0][0]
__________________________________________________________________________________________________
block_4a_relu_1 (Activation) (None, 512, 16, 16) 0 block_4a_bn_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D) (None, 512, 16, 16) 2359808 block_4a_relu_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 16, 16) 131584 block_3b_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 16, 16) 2048 block_4a_conv_2[0][0]
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 16, 16) 2048 block_4a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_7 (Add) (None, 512, 16, 16) 0 block_4a_bn_2[0][0]
block_4a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_4a_relu (Activation) (None, 512, 16, 16) 0 add_7[0][0]
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D) (None, 512, 16, 16) 2359808 block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 16, 16) 2048 block_4b_conv_1[0][0]
__________________________________________________________________________________________________
block_4b_relu_1 (Activation) (None, 512, 16, 16) 0 block_4b_bn_1[0][0]
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D) (None, 512, 16, 16) 2359808 block_4b_relu_1[0][0]
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 16, 16) 2048 block_4b_conv_2[0][0]
__________________________________________________________________________________________________
add_8 (Add) (None, 512, 16, 16) 0 block_4b_bn_2[0][0]
block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_relu (Activation) (None, 512, 16, 16) 0 add_8[0][0]
__________________________________________________________________________________________________
output_bbox (Conv2D) (None, 4, 16, 16) 2052 block_4b_relu[0][0]
__________________________________________________________________________________________________
output_cov (Conv2D) (None, 1, 16, 16) 513 block_4b_relu[0][0]
==================================================================================================
Total params: 11,197,893
Trainable params: 11,188,165
Non-trainable params: 9,728
__________________________________________________________________________________________________
2023-08-29 20:09:04,672 [TAO Toolkit] [INFO] root 2102: DetectNet V2 model built.
2023-08-29 20:09:04,672 [TAO Toolkit] [INFO] root 2102: Building rasterizer.
2023-08-29 20:09:04,673 [TAO Toolkit] [INFO] root 2102: Rasterizers built.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/training/training_proto_utilities.py:102: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.
2023-08-29 20:09:04,673 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/training/training_proto_utilities.py:102: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py:718: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.
2023-08-29 20:09:04,683 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py:718: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.
2023-08-29 20:09:04,683 [TAO Toolkit] [INFO] root 2102: Building training graph.
2023-08-29 20:09:04,684 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 175: Serial augmentation enabled = False
2023-08-29 20:09:04,684 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 177: Pseudo sharding enabled = False
2023-08-29 20:09:04,685 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 269: Max Image Dimensions (all sources): (0, 0)
2023-08-29 20:09:04,685 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 380: number of cpus: 16, io threads: 32, compute threads: 16, buffered batches: 4
2023-08-29 20:09:04,685 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 387: total dataset size 64, number of sources: 1, batch size per gpu: 4, steps: 16
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
2023-08-29 20:09:04,710 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
2023-08-29 20:09:06,345 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataloader.default_dataloader 546: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2023-08-29 20:09:08,311 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 409: shuffle: True - shard 0 of 1
2023-08-29 20:09:08,314 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 479: sampling 1 datasets with weights:
2023-08-29 20:09:08,314 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 481: source: 0 weight: 1.000000
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.
2023-08-29 20:09:08,871 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.
2023-08-29 20:09:09,410 [TAO Toolkit] [INFO] __main__ 536: Found 64 samples in training set
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/visualizer/tensorboard_visualizer.py:92: The name tf.summary.image is deprecated. Please use tf.compat.v1.summary.image instead.
2023-08-29 20:09:09,412 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/visualizer/tensorboard_visualizer.py:92: The name tf.summary.image is deprecated. Please use tf.compat.v1.summary.image instead.
2023-08-29 20:09:09,413 [TAO Toolkit] [INFO] root 2102: Rasterizing tensors.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/rasterizers/bbox_rasterizer.py:348: The name tf.bincount is deprecated. Please use tf.math.bincount instead.
2023-08-29 20:09:09,479 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/rasterizers/bbox_rasterizer.py:348: The name tf.bincount is deprecated. Please use tf.math.bincount instead.
2023-08-29 20:09:09,549 [TAO Toolkit] [INFO] root 2102: Tensors rasterized.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/training/training_proto_utilities.py:49: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.
2023-08-29 20:09:09,549 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/training/training_proto_utilities.py:49: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_functions.py:29: The name tf.log is deprecated. Please use tf.math.log instead.
2023-08-29 20:09:09,658 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_functions.py:29: The name tf.log is deprecated. Please use tf.math.log instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_auto_weight_hook.py:250: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.
2023-08-29 20:09:09,679 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/cost_function/cost_auto_weight_hook.py:250: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/visualizer/tensorboard_visualizer.py:99: The name tf.summary.histogram is deprecated. Please use tf.compat.v1.summary.histogram instead.
2023-08-29 20:09:10,786 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/visualizer/tensorboard_visualizer.py:99: The name tf.summary.histogram is deprecated. Please use tf.compat.v1.summary.histogram instead.
2023-08-29 20:09:10,947 [TAO Toolkit] [INFO] root 2102: Training graph built.
2023-08-29 20:09:10,948 [TAO Toolkit] [INFO] root 2102: Building validation graph.
2023-08-29 20:09:10,949 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 175: Serial augmentation enabled = False
2023-08-29 20:09:10,949 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 177: Pseudo sharding enabled = False
2023-08-29 20:09:10,949 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 269: Max Image Dimensions (all sources): (0, 0)
2023-08-29 20:09:10,949 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 380: number of cpus: 16, io threads: 32, compute threads: 16, buffered batches: 4
2023-08-29 20:09:10,949 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 387: total dataset size 10, number of sources: 1, batch size per gpu: 4, steps: 3
2023-08-29 20:09:10,973 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataloader.default_dataloader 546: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2023-08-29 20:09:11,125 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 409: shuffle: False - shard 0 of 1
2023-08-29 20:09:11,128 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 479: sampling 1 datasets with weights:
2023-08-29 20:09:11,128 [TAO Toolkit] [INFO] nvidia_tao_tf1.blocks.multi_source_loader.data_loader 481: source: 0 weight: 1.000000
2023-08-29 20:09:11,276 [TAO Toolkit] [INFO] __main__ 591: Found 10 samples in validation set
2023-08-29 20:09:11,276 [TAO Toolkit] [INFO] root 2102: Rasterizing tensors.
2023-08-29 20:09:11,396 [TAO Toolkit] [INFO] root 2102: Tensors rasterized.
2023-08-29 20:09:11,576 [TAO Toolkit] [INFO] root 2102: Validation graph built.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/tfhooks/validation_hook.py:58: The name tf.summary.FileWriterCache is deprecated. Please use tf.compat.v1.summary.FileWriterCache instead.
2023-08-29 20:09:11,576 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/tfhooks/validation_hook.py:58: The name tf.summary.FileWriterCache is deprecated. Please use tf.compat.v1.summary.FileWriterCache instead.
2023-08-29 20:09:12,305 [TAO Toolkit] [INFO] root 2102: Running training loop.
2023-08-29 20:09:12,306 [TAO Toolkit] [INFO] __main__ 135: Checkpoint interval: 10
2023-08-29 20:09:12,306 [TAO Toolkit] [INFO] root 2102: Number of logging points 50 must be <= than the number of steps per epoch 16.
2023-08-29 20:09:12,306 [TAO Toolkit] [INFO] root 2102: Number of logging points 50 must be <= than the number of steps per epoch 16.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 1067, in <module>
    raise e
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 1046, in <module>
    main()
  File "/usr/local/lib/python3.8/dist-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/utilities/timer.py", line 46, in wrapped_fn
    return_args = fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 1024, in main
    run_experiment(
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 887, in run_experiment
    train_gridbox(results_dir, experiment_spec, output_model_file_name, input_model_file_name,
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 760, in train_gridbox
    run_training_loop(experiment_spec, results_dir, gridbox_model, hooks, steps_per_epoch,
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/train.py", line 151, in run_training_loop
    raise ValueError(validation_message)
ValueError: Number of logging points 50 must be <= than the number of steps per epoch 16.
Execution status: FAIL
2023-08-29 16:09:16,115 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 337: Stopping container.
Here is my training spec file:
random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/workspace/tao-experiments/data/tfrecords/kitti_trainval/*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
  image_extension: "png"
  target_class_mapping {
    key: "logo"
    value: "logo"
  }
  validation_fold: 0
}
augmentation_config {
  preprocessing {
    output_image_width: 256
    output_image_height: 256
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.10000000149
    contrast_center: 0.5
  }
}
postprocessing_config {
  target_class_config {
    key: "logo"
    value {
      clustering_config {
        clustering_algorithm: DBSCAN
        dbscan_confidence_threshold: 0.9
        coverage_threshold: 0.00499999988824
        dbscan_eps: 0.20000000298
        dbscan_min_samples: 1
        minimum_bounding_box_height: 20
      }
    }
  }
}
model_config {
  pretrained_model_file: "/workspace/tao-experiments/detectnet_v2/pretrained_resnet18/pretrained_detectnet_v2_vresnet18/resnet18.hdf5"
  num_layers: 18
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  arch: "resnet"
}
evaluation_config {
  validation_period_during_training: 10
  first_validation_epoch: 30
  minimum_detection_ground_truth_overlap {
    key: "logo"
    value: 0.699999988079
  }
  evaluation_box_config {
    key: "logo"
    value {
      minimum_height: 20
      maximum_height: 9999
      minimum_width: 10
      maximum_width: 9999
    }
  }
  average_precision_mode: INTEGRATE
}
cost_function_config {
  target_classes {
    name: "logo"
    class_weight: 1.0
    coverage_foreground_weight: 0.0500000007451
    objectives {
      name: "cov"
      initial_weight: 1.0
      weight_target: 1.0
    }
    objectives {
      name: "bbox"
      initial_weight: 10.0
      weight_target: 10.0
    }
  }
  enable_autoweighting: false
  max_objective_weight: 0.999899983406
  min_objective_weight: 9.99999974738e-05
}
training_config {
  batch_size_per_gpu: 4
  num_epochs: 120
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-07
      max_learning_rate: 5e-05
      soft_start: 0.10000000149
      annealing: 0.699999988079
    }
  }
  regularizer {
    type: L1
    weight: 3.00000002618e-09
  }
  optimizer {
    adam {
      epsilon: 9.99999993923e-09
      beta1: 0.899999976158
      beta2: 0.999000012875
    }
  }
  cost_scaling {
    initial_exponent: 20.0
    increment: 0.005
    decrement: 1.0
  }
  visualizer{
    enabled: true
    num_images: 3
    scalar_logging_frequency: 50
    infrequent_logging_frequency: 5
    target_class_config {
      key: "logo"
      value: {
        coverage_threshold: 0.005
      }
    }
    clearml_config{
      project: "TAO Toolkit ClearML Demo"
      task: "detectnet_v2_resnet18_clearml"
      tags: "detectnet_v2"
      tags: "training"
      tags: "resnet18"
      tags: "unpruned"
    }
    wandb_config{
      project: "TAO Toolkit Wandb Demo"
      name: "detectnet_v2_resnet18_wandb"
      tags: "detectnet_v2"
      tags: "training"
      tags: "resnet18"
      tags: "unpruned"
    }
  }
  checkpoint_interval: 10
}
bbox_rasterizer_config {
  target_class_config {
    key: "logo"
    value {
      cov_center_x: 0.5
      cov_center_y: 0.5
      cov_radius_x: 0.40000000596
      cov_radius_y: 0.40000000596
      bbox_min_radius: 1.0
    }
  }
  deadzone_radius: 0.400000154972
}
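For reference, the numbers in the error message can be reproduced from the training log and the spec. This is only a minimal sketch, not TAO's actual code. It assumes that steps per epoch is the sample count divided by `batch_size_per_gpu`, rounded up, which matches the logged 64 samples giving 16 steps and 10 samples giving 3 steps. It also assumes that the 50 "logging points" come from `scalar_logging_frequency: 50` in the `visualizer` block, since that is the only 50 in the spec. The helper names `steps_per_epoch` and `logging_points_ok` are hypothetical.

```python
import math


def steps_per_epoch(num_samples: int, batch_size_per_gpu: int) -> int:
    """Assumed formula: round up so a partial final batch still counts as a step."""
    return math.ceil(num_samples / batch_size_per_gpu)


def logging_points_ok(num_logging_points: int, steps: int) -> bool:
    """Mirror the condition stated in the error message: logging points <= steps per epoch."""
    return num_logging_points <= steps


# Values taken from the training log and the spec file.
train_steps = steps_per_epoch(64, 4)  # training set: 64 samples, batch size 4
val_steps = steps_per_epoch(10, 4)    # validation set: 10 samples, batch size 4

print(train_steps, val_steps)              # 16 3
print(logging_points_ok(50, train_steps))  # False: 50 > 16, the failing case
print(logging_points_ok(16, train_steps))  # True: 16 <= 16
```

If those assumptions hold, a `scalar_logging_frequency` of 16 or lower would pass this check with 64 training images and a batch size of 4.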
The TFRecords converter appears to process my dataset successfully. Here is its output:
Converting Tfrecords for kitti trainval dataset
2023-08-29 15:39:18,268 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-08-29 15:39:18,306 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2023-08-29 15:39:18,328 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
2023-08-29 19:39:18.980331: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-08-29 19:39:19,009 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2023-08-29 19:39:20,037 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 19:39:20,061 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 19:39:20,063 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 19:39:21,047 [TAO Toolkit] [WARNING] matplotlib 500: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-8x_hsjft because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2023-08-29 19:39:21,200 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 19:39:22,262 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 19:39:22,283 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 19:39:22,285 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2023-08-29 19:39:22,611 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.logging.logging 197: Log file already exists at /workspace/tao-experiments/detectnet_v2/status.json
2023-08-29 19:39:22,611 [TAO Toolkit] [INFO] root 2102: Starting Object Detection Dataset Convert.
2023-08-29 19:39:22,611 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.build_converter 87: Instantiating a kitti converter
2023-08-29 19:39:22,611 [TAO Toolkit] [INFO] root 2102: Instantiating a kitti converter
2023-08-29 19:39:22,611 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 71: Creating output directory /workspace/tao-experiments/data/tfrecords/kitti_trainval
2023-08-29 19:39:22,611 [TAO Toolkit] [INFO] root 2102: Generating partitions
2023-08-29 19:39:22,612 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.kitti_converter_lib 176: Num images in
Train: 64 Val: 10
2023-08-29 19:39:22,612 [TAO Toolkit] [INFO] root 2102: Num images in
Train: 64 Val: 10
2023-08-29 19:39:22,612 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.kitti_converter_lib 197: Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2023-08-29 19:39:22,612 [TAO Toolkit] [INFO] root 2102: Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2023-08-29 19:39:22,612 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 0
2023-08-29 19:39:22,612 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 0
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/dataset_converter_lib.py:181: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
2023-08-29 19:39:22,612 [TAO Toolkit] [WARNING] tensorflow 137: From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/dataset_converter_lib.py:181: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
2023-08-29 19:39:22,614 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 1
2023-08-29 19:39:22,614 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 1
2023-08-29 19:39:22,615 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 2
2023-08-29 19:39:22,615 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 2
2023-08-29 19:39:22,615 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 3
2023-08-29 19:39:22,615 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 3
2023-08-29 19:39:22,615 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 4
2023-08-29 19:39:22,615 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 4
2023-08-29 19:39:22,616 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 5
2023-08-29 19:39:22,616 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 5
2023-08-29 19:39:22,616 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 6
2023-08-29 19:39:22,616 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 6
2023-08-29 19:39:22,617 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 7
2023-08-29 19:39:22,617 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 7
2023-08-29 19:39:22,617 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 8
2023-08-29 19:39:22,617 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 8
2023-08-29 19:39:22,617 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 0, shard 9
2023-08-29 19:39:22,617 [TAO Toolkit] [INFO] root 2102: Writing partition 0, shard 9
2023-08-29 19:39:22,618 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 250:
Wrote the following numbers of objects:
b'logo': 10
2023-08-29 19:39:22,618 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 0
2023-08-29 19:39:22,618 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 0
2023-08-29 19:39:22,619 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 1
2023-08-29 19:39:22,619 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 1
2023-08-29 19:39:22,621 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 2
2023-08-29 19:39:22,621 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 2
2023-08-29 19:39:22,622 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 3
2023-08-29 19:39:22,622 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 3
2023-08-29 19:39:22,624 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 4
2023-08-29 19:39:22,624 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 4
2023-08-29 19:39:22,625 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 5
2023-08-29 19:39:22,625 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 5
2023-08-29 19:39:22,626 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 6
2023-08-29 19:39:22,626 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 6
2023-08-29 19:39:22,628 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 7
2023-08-29 19:39:22,628 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 7
2023-08-29 19:39:22,629 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 8
2023-08-29 19:39:22,629 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 8
2023-08-29 19:39:22,630 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 166: Writing partition 1, shard 9
2023-08-29 19:39:22,631 [TAO Toolkit] [INFO] root 2102: Writing partition 1, shard 9
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 250:
Wrote the following numbers of objects:
b'logo': 65
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 89: Cumulative object statistics
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] root 2102: Cumulative object statistics
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 250:
Wrote the following numbers of objects:
b'logo': 75
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 105: Class map.
Label in GT: Label in tfrecords file
b'logo': b'logo'
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] root 2102: Class map.
Label in GT: Label in tfrecords file
b'logo': b'logo'
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] root 2102: For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.detectnet_v2.dataio.dataset_converter_lib 114: Tfrecords generation complete.
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] root 2102: TFRecords generation complete.
2023-08-29 19:39:22,633 [TAO Toolkit] [INFO] root 2102: Dataset convert finished successfully.
Execution status: PASS
2023-08-29 15:39:25,785 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 337: Stopping container.