Docker instantiation failed with error: 500 Server Error: Internal Server Error ("OCI runtime create failed...)

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

2021-06-12 13:03:33,305 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

2021-06-12 13:03:33,309 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

2021-06-12 13:03:33,375 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4185: The name tf.truncated_normal is deprecated. Please use tf.random.truncated_normal instead.

2021-06-12 13:03:34,909 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4185: The name tf.truncated_normal is deprecated. Please use tf.random.truncated_normal instead.

Initialize optimizer
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

2021-06-12 13:03:38,041 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

2021-06-12 13:03:38,042 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

2021-06-12 13:03:39,009 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

2021-06-12 13:03:40,113 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/architecture/ssd_loss.py:87: The name tf.log is deprecated. Please use tf.math.log instead.

2021-06-12 13:03:40,142 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/architecture/ssd_loss.py:87: The name tf.log is deprecated. Please use tf.math.log instead.

WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:121: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.

2021-06-12 13:03:40,270 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:121: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.

WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:122: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead.

2021-06-12 13:03:40,271 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:122: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead.

WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:123: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

2021-06-12 13:03:40,271 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:123: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
Input (InputLayer)              (None, 3, 300, 300)  0
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 150, 150) 9408        Input[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 150, 150) 256         conv1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 150, 150) 0           bn_conv1[0][0]
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 75, 75)   36864       activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 75, 75)   256         block_1a_conv_1[0][0]
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 64, 75, 75)   0           block_1a_bn_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 75, 75)   36864       block_1a_relu_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 75, 75)   4096        activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 75, 75)   256         block_1a_conv_2[0][0]
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 75, 75)   256         block_1a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 75, 75)   0           block_1a_bn_2[0][0]
                                                                 block_1a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 64, 75, 75)   0           add_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 75, 75)   36864       block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 75, 75)   256         block_1b_conv_1[0][0]
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 64, 75, 75)   0           block_1b_bn_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 75, 75)   36864       block_1b_relu_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_shortcut (Conv2D) (None, 64, 75, 75)   4096        block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 75, 75)   256         block_1b_conv_2[0][0]
__________________________________________________________________________________________________
block_1b_bn_shortcut (BatchNorm (None, 64, 75, 75)   256         block_1b_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 75, 75)   0           block_1b_bn_2[0][0]
                                                                 block_1b_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 64, 75, 75)   0           add_2[0][0]
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 38, 38)  73728       block_1b_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 38, 38)  512         block_2a_conv_1[0][0]
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 128, 38, 38)  0           block_2a_bn_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 38, 38)  147456      block_2a_relu_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 38, 38)  8192        block_1b_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 38, 38)  512         block_2a_conv_2[0][0]
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 38, 38)  512         block_2a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 38, 38)  0           block_2a_bn_2[0][0]
                                                                 block_2a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 128, 38, 38)  0           add_3[0][0]
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 38, 38)  147456      block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 38, 38)  512         block_2b_conv_1[0][0]
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 128, 38, 38)  0           block_2b_bn_1[0][0]
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 38, 38)  147456      block_2b_relu_1[0][0]
__________________________________________________________________________________________________
block_2b_conv_shortcut (Conv2D) (None, 128, 38, 38)  16384       block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 38, 38)  512         block_2b_conv_2[0][0]
__________________________________________________________________________________________________
block_2b_bn_shortcut (BatchNorm (None, 128, 38, 38)  512         block_2b_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 38, 38)  0           block_2b_bn_2[0][0]
                                                                 block_2b_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 128, 38, 38)  0           add_4[0][0]
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 19, 19)  294912      block_2b_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 19, 19)  1024        block_3a_conv_1[0][0]
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 256, 19, 19)  0           block_3a_bn_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 19, 19)  589824      block_3a_relu_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 19, 19)  32768       block_2b_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 19, 19)  1024        block_3a_conv_2[0][0]
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 19, 19)  1024        block_3a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 19, 19)  0           block_3a_bn_2[0][0]
                                                                 block_3a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 256, 19, 19)  0           add_5[0][0]
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 19, 19)  589824      block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 19, 19)  1024        block_3b_conv_1[0][0]
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 256, 19, 19)  0           block_3b_bn_1[0][0]
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 19, 19)  589824      block_3b_relu_1[0][0]
__________________________________________________________________________________________________
block_3b_conv_shortcut (Conv2D) (None, 256, 19, 19)  65536       block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 19, 19)  1024        block_3b_conv_2[0][0]
__________________________________________________________________________________________________
block_3b_bn_shortcut (BatchNorm (None, 256, 19, 19)  1024        block_3b_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 19, 19)  0           block_3b_bn_2[0][0]
                                                                 block_3b_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 256, 19, 19)  0           add_6[0][0]
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 19, 19)  1179648     block_3b_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 19, 19)  2048        block_4a_conv_1[0][0]
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 512, 19, 19)  0           block_4a_bn_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 19, 19)  2359296     block_4a_relu_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 19, 19)  131072      block_3b_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 19, 19)  2048        block_4a_conv_2[0][0]
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 19, 19)  2048        block_4a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, 512, 19, 19)  0           block_4a_bn_2[0][0]
                                                                 block_4a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 512, 19, 19)  0           add_7[0][0]
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 19, 19)  2359296     block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 19, 19)  2048        block_4b_conv_1[0][0]
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 512, 19, 19)  0           block_4b_bn_1[0][0]
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 19, 19)  2359296     block_4b_relu_1[0][0]
__________________________________________________________________________________________________
block_4b_conv_shortcut (Conv2D) (None, 512, 19, 19)  262144      block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 19, 19)  2048        block_4b_conv_2[0][0]
__________________________________________________________________________________________________
block_4b_bn_shortcut (BatchNorm (None, 512, 19, 19)  2048        block_4b_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, 512, 19, 19)  0           block_4b_bn_2[0][0]
                                                                 block_4b_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 512, 19, 19)  0           add_8[0][0]
__________________________________________________________________________________________________
ssd_expand_block_0_conv_0 (Conv (None, 256, 19, 19)  131328      block_4b_relu[0][0]
__________________________________________________________________________________________________
ssd_expand_block_0_relu_0 (ReLU (None, 256, 19, 19)  0           ssd_expand_block_0_conv_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_0_conv_1 (Conv (None, 256, 19, 19)  589824      ssd_expand_block_0_relu_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_0_bn_1 (BatchN (None, 256, 19, 19)  1024        ssd_expand_block_0_conv_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_0_relu_1 (ReLU (None, 256, 19, 19)  0           ssd_expand_block_0_bn_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_1_conv_0 (Conv (None, 128, 19, 19)  32896       ssd_expand_block_0_relu_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_1_relu_0 (ReLU (None, 128, 19, 19)  0           ssd_expand_block_1_conv_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_1_conv_1 (Conv (None, 256, 10, 10)  294912      ssd_expand_block_1_relu_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_1_bn_1 (BatchN (None, 256, 10, 10)  1024        ssd_expand_block_1_conv_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_1_relu_1 (ReLU (None, 256, 10, 10)  0           ssd_expand_block_1_bn_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_2_conv_0 (Conv (None, 64, 10, 10)   16448       ssd_expand_block_1_relu_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_2_relu_0 (ReLU (None, 64, 10, 10)   0           ssd_expand_block_2_conv_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_2_conv_1 (Conv (None, 128, 5, 5)    73728       ssd_expand_block_2_relu_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_2_bn_1 (BatchN (None, 128, 5, 5)    512         ssd_expand_block_2_conv_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_2_relu_1 (ReLU (None, 128, 5, 5)    0           ssd_expand_block_2_bn_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_3_conv_0 (Conv (None, 64, 5, 5)     8256        ssd_expand_block_2_relu_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_3_relu_0 (ReLU (None, 64, 5, 5)     0           ssd_expand_block_3_conv_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_3_conv_1 (Conv (None, 128, 3, 3)    73728       ssd_expand_block_3_relu_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_3_bn_1 (BatchN (None, 128, 3, 3)    512         ssd_expand_block_3_conv_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_3_relu_1 (ReLU (None, 128, 3, 3)    0           ssd_expand_block_3_bn_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_4_conv_0 (Conv (None, 64, 3, 3)     8256        ssd_expand_block_3_relu_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_4_relu_0 (ReLU (None, 64, 3, 3)     0           ssd_expand_block_4_conv_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_4_conv_1 (Conv (None, 128, 2, 2)    73728       ssd_expand_block_4_relu_0[0][0]
__________________________________________________________________________________________________
ssd_expand_block_4_bn_1 (BatchN (None, 128, 2, 2)    512         ssd_expand_block_4_conv_1[0][0]
__________________________________________________________________________________________________
ssd_expand_block_4_relu_1 (ReLU (None, 128, 2, 2)    0           ssd_expand_block_4_bn_1[0][0]
__________________________________________________________________________________________________
ssd_conf_0 (Conv2D)             (None, 8, 38, 38)    9224        block_2b_relu[0][0]
__________________________________________________________________________________________________
ssd_conf_1 (Conv2D)             (None, 8, 19, 19)    18440       ssd_expand_block_0_relu_1[0][0]
__________________________________________________________________________________________________
ssd_conf_2 (Conv2D)             (None, 8, 10, 10)    18440       ssd_expand_block_1_relu_1[0][0]
__________________________________________________________________________________________________
ssd_conf_3 (Conv2D)             (None, 8, 5, 5)      9224        ssd_expand_block_2_relu_1[0][0]
__________________________________________________________________________________________________
ssd_conf_4 (Conv2D)             (None, 8, 3, 3)      9224        ssd_expand_block_3_relu_1[0][0]
__________________________________________________________________________________________________
ssd_conf_5 (Conv2D)             (None, 12, 2, 2)     13836       ssd_expand_block_4_relu_1[0][0]
__________________________________________________________________________________________________
permute_1 (Permute)             (None, 38, 38, 8)    0           ssd_conf_0[0][0]
__________________________________________________________________________________________________
permute_2 (Permute)             (None, 19, 19, 8)    0           ssd_conf_1[0][0]
__________________________________________________________________________________________________
permute_3 (Permute)             (None, 10, 10, 8)    0           ssd_conf_2[0][0]
__________________________________________________________________________________________________
permute_4 (Permute)             (None, 5, 5, 8)      0           ssd_conf_3[0][0]
__________________________________________________________________________________________________
permute_5 (Permute)             (None, 3, 3, 8)      0           ssd_conf_4[0][0]
__________________________________________________________________________________________________
permute_6 (Permute)             (None, 2, 2, 12)     0           ssd_conf_5[0][0]
__________________________________________________________________________________________________
conf_reshape_0 (Reshape)        (None, 5776, 1, 2)   0           permute_1[0][0]
__________________________________________________________________________________________________
conf_reshape_1 (Reshape)        (None, 1444, 1, 2)   0           permute_2[0][0]
__________________________________________________________________________________________________
conf_reshape_2 (Reshape)        (None, 400, 1, 2)    0           permute_3[0][0]
__________________________________________________________________________________________________
conf_reshape_3 (Reshape)        (None, 100, 1, 2)    0           permute_4[0][0]
__________________________________________________________________________________________________
conf_reshape_4 (Reshape)        (None, 36, 1, 2)     0           permute_5[0][0]
__________________________________________________________________________________________________
conf_reshape_5 (Reshape)        (None, 24, 1, 2)     0           permute_6[0][0]
__________________________________________________________________________________________________
mbox_conf (Concatenate)         (None, 7780, 1, 2)   0           conf_reshape_0[0][0]
                                                                 conf_reshape_1[0][0]
                                                                 conf_reshape_2[0][0]
                                                                 conf_reshape_3[0][0]
                                                                 conf_reshape_4[0][0]
                                                                 conf_reshape_5[0][0]
__________________________________________________________________________________________________
ssd_loc_0 (Conv2D)              (None, 16, 38, 38)   18448       block_2b_relu[0][0]
__________________________________________________________________________________________________
ssd_loc_1 (Conv2D)              (None, 16, 19, 19)   36880       ssd_expand_block_0_relu_1[0][0]
__________________________________________________________________________________________________
ssd_loc_2 (Conv2D)              (None, 16, 10, 10)   36880       ssd_expand_block_1_relu_1[0][0]
__________________________________________________________________________________________________
ssd_loc_3 (Conv2D)              (None, 16, 5, 5)     18448       ssd_expand_block_2_relu_1[0][0]
__________________________________________________________________________________________________
ssd_loc_4 (Conv2D)              (None, 16, 3, 3)     18448       ssd_expand_block_3_relu_1[0][0]
__________________________________________________________________________________________________
ssd_loc_5 (Conv2D)              (None, 24, 2, 2)     27672       ssd_expand_block_4_relu_1[0][0]
__________________________________________________________________________________________________
before_softmax_permute (Permute (None, 2, 1, 7780)   0           mbox_conf[0][0]
__________________________________________________________________________________________________
permute_7 (Permute)             (None, 38, 38, 16)   0           ssd_loc_0[0][0]
__________________________________________________________________________________________________
permute_8 (Permute)             (None, 19, 19, 16)   0           ssd_loc_1[0][0]
__________________________________________________________________________________________________
permute_9 (Permute)             (None, 10, 10, 16)   0           ssd_loc_2[0][0]
__________________________________________________________________________________________________
permute_10 (Permute)            (None, 5, 5, 16)     0           ssd_loc_3[0][0]
__________________________________________________________________________________________________
permute_11 (Permute)            (None, 3, 3, 16)     0           ssd_loc_4[0][0]
__________________________________________________________________________________________________
permute_12 (Permute)            (None, 2, 2, 24)     0           ssd_loc_5[0][0]
__________________________________________________________________________________________________
ssd_anchor_0 (AnchorBoxes)      (None, 1444, 4, 8)   0           ssd_loc_0[0][0]
__________________________________________________________________________________________________
ssd_anchor_1 (AnchorBoxes)      (None, 361, 4, 8)    0           ssd_loc_1[0][0]
__________________________________________________________________________________________________
ssd_anchor_2 (AnchorBoxes)      (None, 100, 4, 8)    0           ssd_loc_2[0][0]
__________________________________________________________________________________________________
ssd_anchor_3 (AnchorBoxes)      (None, 25, 4, 8)     0           ssd_loc_3[0][0]
__________________________________________________________________________________________________
ssd_anchor_4 (AnchorBoxes)      (None, 9, 4, 8)      0           ssd_loc_4[0][0]
__________________________________________________________________________________________________
ssd_anchor_5 (AnchorBoxes)      (None, 4, 6, 8)      0           ssd_loc_5[0][0]
__________________________________________________________________________________________________
mbox_conf_softmax_ (Softmax)    (None, 2, 1, 7780)   0           before_softmax_permute[0][0]
__________________________________________________________________________________________________
loc_reshape_0 (Reshape)         (None, 5776, 1, 4)   0           permute_7[0][0]
__________________________________________________________________________________________________
loc_reshape_1 (Reshape)         (None, 1444, 1, 4)   0           permute_8[0][0]
__________________________________________________________________________________________________
loc_reshape_2 (Reshape)         (None, 400, 1, 4)    0           permute_9[0][0]
__________________________________________________________________________________________________
loc_reshape_3 (Reshape)         (None, 100, 1, 4)    0           permute_10[0][0]
__________________________________________________________________________________________________
loc_reshape_4 (Reshape)         (None, 36, 1, 4)     0           permute_11[0][0]
__________________________________________________________________________________________________
loc_reshape_5 (Reshape)         (None, 24, 1, 4)     0           permute_12[0][0]
__________________________________________________________________________________________________
anchor_reshape_0 (Reshape)      (None, 5776, 1, 8)   0           ssd_anchor_0[0][0]
__________________________________________________________________________________________________
anchor_reshape_1 (Reshape)      (None, 1444, 1, 8)   0           ssd_anchor_1[0][0]
__________________________________________________________________________________________________
anchor_reshape_2 (Reshape)      (None, 400, 1, 8)    0           ssd_anchor_2[0][0]
__________________________________________________________________________________________________
anchor_reshape_3 (Reshape)      (None, 100, 1, 8)    0           ssd_anchor_3[0][0]
__________________________________________________________________________________________________
anchor_reshape_4 (Reshape)      (None, 36, 1, 8)     0           ssd_anchor_4[0][0]
__________________________________________________________________________________________________
anchor_reshape_5 (Reshape)      (None, 24, 1, 8)     0           ssd_anchor_5[0][0]
__________________________________________________________________________________________________
mbox_conf_softmax (Permute)     (None, 7780, 1, 2)   0           mbox_conf_softmax_[0][0]
__________________________________________________________________________________________________
mbox_loc (Concatenate)          (None, 7780, 1, 4)   0           loc_reshape_0[0][0]
                                                                 loc_reshape_1[0][0]
                                                                 loc_reshape_2[0][0]
                                                                 loc_reshape_3[0][0]
                                                                 loc_reshape_4[0][0]
                                                                 loc_reshape_5[0][0]
__________________________________________________________________________________________________
mbox_priorbox (Concatenate)     (None, 7780, 1, 8)   0           anchor_reshape_0[0][0]
                                                                 anchor_reshape_1[0][0]
                                                                 anchor_reshape_2[0][0]
                                                                 anchor_reshape_3[0][0]
                                                                 anchor_reshape_4[0][0]
                                                                 anchor_reshape_5[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 7780, 1, 14)  0           mbox_conf_softmax[0][0]
                                                                 mbox_loc[0][0]
                                                                 mbox_priorbox[0][0]
__________________________________________________________________________________________________
ssd_predictions (Reshape)       (None, 7780, 14)     0           concatenate_1[0][0]
==================================================================================================
Total params: 13,084,316
Trainable params: 13,061,468
Non-trainable params: 22,848
__________________________________________________________________________________________________
2021-06-12 13:03:40,535 [INFO] __main__: Number of images in the training dataset:       24600
2021-06-12 13:03:40,535 [INFO] __main__: Number of images in the validation dataset:       360
Traceback (most recent call last):
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py", line 313, in <module>
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py", line 309, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py", line 156, in run_experiment
FileNotFoundError: [Errno 2] No such file or directory: '/workspace/Work/Crowd-Detection/ouput/weights'
Traceback (most recent call last):
  File "/usr/local/bin/ssd", line 8, in <module>
    sys.exit(main())
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/entrypoint/ssd.py", line 12, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/entrypoint/entrypoint.py", line 296, in launch_job
AssertionError: Process run failed.

Please mkdir the ouput folder under your local directory.
The path /workspace/Work/Crowd-Detection/ouput/ should then be available inside the docker container.
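If it is easier to create the folder from Python (for example from a notebook cell), the following is a minimal sketch. `ensure_dir` is a hypothetical helper and not part of TLT; `os.makedirs(..., exist_ok=True)` behaves like `mkdir -p`. The demo uses a temporary root so it runs anywhere; in practice you would pass the local directory that is mounted to /workspace/Work/Crowd-Detection.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path and any missing parent folders (like `mkdir -p`).

    Returns the absolute path so it can be printed or logged.
    """
    os.makedirs(path, exist_ok=True)
    return os.path.abspath(path)


if __name__ == "__main__":
    # Temporary root used only for demonstration; replace it with the
    # local folder that is mounted into the container.
    root = tempfile.mkdtemp()
    weights_dir = ensure_dir(os.path.join(root, "ouput", "weights"))
    print(weights_dir, os.path.isdir(weights_dir))
```

Calling `ensure_dir` a second time on the same path is harmless, which makes it safe to run before every training attempt.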

The output folder already exists in the local directory and inside the docker workspace.

Also, is it a problem that the layers are not connected to equal layers?

No, the error is not related to the layers.
It is a common error. You need to mkdir -p the folder.
You can refer to the jupyter notebooks for ssd.

!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned

!tlt ssd train --gpus 1 --gpu_index=$GPU_INDEX \
               -e $SPECS_DIR/ssd_train_resnet18_kitti.txt \
               -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
               -k $KEY \
               -m $USER_EXPERIMENT_DIR/pretrained_resnet18/tlt_pretrained_object_detection_vresnet18/resnet_18.hdf5

The problem still occurs after running mkdir -p on the folder.

It seems to be looking for a weights file inside the output folder. Is this something I need to make?

Please mkdir -p xxx/output/weights to check if it helps.

It makes no difference, unfortunately.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

What is the training command line?
To narrow it down, can you also log in to the docker container and try to train there?

I ended up fixing the problem by reinstalling each part and ensuring they worked one at a time.

I created a guide of the process which I will leave below:

Setting up the NVIDIA tlt launcher

This guide assumes that you have nothing pre-installed.

Pre-requisites

First, follow one of the guides from Ubuntu or NVIDIA to set up the Microsoft Windows Insider Program, WSL2, NVIDIA drivers, docker and nvidia-docker2.

Both guides cover the same information but use slightly different commands, so if one doesn’t work, try the other.

For the tlt launcher to use docker, it needs to be able to run docker without sudo every time. To achieve this, follow this guide from Docker.

To pull the tlt information, you need to create and log in to both a docker account and an NGC account.

After creating a docker account from docker, run:

$ docker login

and enter your login information

After creating an NGC account from NGC, click on Setup, generate an API key, then run:

$ docker login nvcr.io

and enter “$oauthtoken” as the username and the API key as the password

Extra notes:

  • I had to choose the Dev Channel in the Windows Insider Program to get the needed update
  • NVIDIA Container Toolkit does not yet support Docker Desktop WSL 2 backend so make sure to install it into WSL2
  • Run the examples in the guides to make sure everything at this stage is working as intended

Installing the launcher

Follow the NVIDIA guide to install the TLT launcher

Ensure it is working by running:

$ tlt detectnet_v2 --help

With the expected outcome as:

Using TensorFlow backend.
usage: detectnet_v2 [-h] [--gpus GPUS] [--gpu_index GPU_INDEX [GPU_INDEX ...]]
                   [--use_amp] [--log_file LOG_FILE]
                   {calibration_tensorfile,dataset_convert,evaluate,export,inference,prune,train}
                   ...

Transfer Learning Toolkit

optional arguments:
-h, --help            show this help message and exit
--gpus GPUS           The number of GPUs to be used for the job.
--gpu_index GPU_INDEX [GPU_INDEX ...]
                     The indices of the GPU's to be used.
--use_amp             Flag to enable Auto Mixed Precision.
--log_file LOG_FILE   Path to the output log file.

tasks:
{calibration_tensorfile,dataset_convert,evaluate,export,inference,prune,train}

Extra notes:

  • Make sure that you start docker with sudo service docker start before running the launcher

Configuring the launcher

The tlt launcher uses the file ~/.tlt_mounts.json to map drives/mount points into the docker container it runs. The file format is specified by NVIDIA here

To create and open the file run the following commands:

$ cd ~/
$ touch .tlt_mounts.json
$ code .tlt_mounts.json

Where code can be replaced by your text editor of choice.

Next, copy the example file from the NVIDIA website into the file and edit the paths so that each source is the full path to the files on your machine and each destination is a path inside a workspace that contains all the files needed.
For example:

    {
        "Mounts": [
            {
                "source": "/mnt/c/Users/USERNAME/project_dir",
                "destination": "/workspace/project_dir"
            },
            {
                "source": "/mnt/c/Users/USERNAME/project_dir/output",
                "destination": "/workspace/project_dir/output"
            },
            {
                "source": "/mnt/c/Users/USERNAME/project_dir/SPEC.txt",
                "destination": "/workspace/project_dir/SPEC.txt"
            }
        ]
    }

Where all the information for the project is contained within “project_dir”
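The file can also be generated from Python, which avoids JSON syntax mistakes and catches a source path that does not exist before the launcher does. This is a rough sketch only: `build_mounts` and `write_mounts` are hypothetical helpers, not part of the tlt launcher, and only the "Mounts" key from the example above is written.

```python
import json
import os


def build_mounts(host_project_dir, workspace_dir="/workspace/project_dir"):
    """Return a config dict that maps one host folder into the container."""
    return {
        "Mounts": [
            {"source": host_project_dir, "destination": workspace_dir},
        ]
    }


def write_mounts(config, path):
    """Write the config as JSON after checking every source exists on the host."""
    missing = [m["source"] for m in config["Mounts"]
               if not os.path.exists(m["source"])]
    if missing:
        raise FileNotFoundError("Missing mount sources: " + ", ".join(missing))
    with open(path, "w") as fh:
        json.dump(config, fh, indent=4)
    return path
```

Assuming the same folder layout as the example, usage would look like `write_mounts(build_mounts("/mnt/c/Users/USERNAME/project_dir"), os.path.expanduser("~/.tlt_mounts.json"))`.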

Creating a specification file

To train the model you need a specification file as shown by NVIDIA here

To create one, either copy the example or go over the different sections and set the values to what you require.

Extra notes:

  • The specification file I used is this
  • batch_size is not shown in the section for Evaluation Config but is required for the training to run
  • The paths to the labels and images should be workspace paths (the destinations specified in your ~/.tlt_mounts.json file), not paths on your own machine
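To work out which workspace path corresponds to a folder on your machine, the mapping in ~/.tlt_mounts.json can be applied by hand or with a small helper. The sketch below assumes the "Mounts" list format from the example file; `to_container_path` is a hypothetical name, not a launcher function.

```python
import posixpath


def to_container_path(host_path, mounts):
    """Translate a host path to the path seen inside the container.

    The mount with the longest matching source wins, so a more specific
    mount takes priority over its parent folder.
    """
    best = None
    for m in mounts:
        src = m["source"].rstrip("/")
        if host_path == src or host_path.startswith(src + "/"):
            if best is None or len(src) > len(best["source"].rstrip("/")):
                best = m
    if best is None:
        raise ValueError("Path is not under any mount: " + host_path)
    # Keep whatever follows the mount source and append it to the destination
    rel = host_path[len(best["source"].rstrip("/")):].lstrip("/")
    dest = best["destination"].rstrip("/")
    return posixpath.join(dest, rel) if rel else dest


if __name__ == "__main__":
    mounts = [{"source": "/mnt/c/Users/USERNAME/project_dir",
               "destination": "/workspace/project_dir"}]
    print(to_container_path("/mnt/c/Users/USERNAME/project_dir/data/images", mounts))
```

A path that raises `ValueError` here is not visible inside the container, so it should not appear in the specification file.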

Running the training

Information on training the Model can be found here

The directories should be paths inside the workspace, as specified in your ~/.tlt_mounts.json file and the specification file.
For example the following command would apply to the directories used in the example ~/.tlt_mounts.json file

$ tlt ssd train -e /workspace/project_dir/SPEC.txt -r /workspace/project_dir/output -k KEY

Where KEY is the key used to encrypt the model and can be set to your own value.

If training crashes due to a lack of memory, lower “batch_size” and “batch_size_per_gpu”, but keep them at a power of two (2, 4, 8, 16, etc.)
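As a small illustration of picking the next value to try, the sketch below steps a batch size down through powers of two. The function names are hypothetical and nothing here is read from or written to the specification file; it only does the arithmetic.

```python
def next_smaller_batch_size(batch_size):
    """Return the largest power of two strictly below batch_size (minimum 1)."""
    if batch_size <= 1:
        return 1
    return 1 << ((batch_size - 1).bit_length() - 1)


def fallback_schedule(start):
    """List the batch sizes to try, in order, after `start` runs out of memory."""
    sizes = []
    size = start
    while size > 1:
        size = next_smaller_batch_size(size)
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    print(fallback_schedule(16))
```

Starting from a value that is not a power of two, such as 24, the first step drops to 16 and the schedule continues from there.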

Bug fixing

Some of the problems that I discovered and how I fixed them are described below

When running sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi and getting the error:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "nvidia-smi": executable file not found in $PATH: unknown.

Run the following lines:

$ cp /usr/lib/wsl/lib/nvidia-smi /usr/bin/nvidia-smi
$ chmod ogu+x /usr/bin/nvidia-smi

Thanks for your sharing! Really appreciate your work!
BTW, you were training with TLT 3.0-dp, right? Did you ever try TLT 3.0?