Hi there,
I'm trying to train an SSD model with a ResNet-10 backbone using TLT, but every time I get the following assertion error. Can you please help me out? Thanks.
Details :
ssd_train_resnet10_kitti.txt :
random_seed: 42
ssd_config {
  aspect_ratios_global: "[1.0, 2.0, 0.5, 3.0, 1.0/3.0]"
  scales: "[0.05, 0.1, 0.25, 0.4, 0.55, 0.7, 0.85]"
  two_boxes_for_ar1: true
  clip_boxes: false
  variances: "[0.1, 0.1, 0.2, 0.2]"
  arch: "resnet"
  nlayers: 10
  freeze_bn: false
  freeze_blocks: 0
}
training_config {
  batch_size_per_gpu: 16
  num_epochs: 80
  enable_qat: false
  learning_rate {
    soft_start_annealing_schedule {
      min_learning_rate: 5e-5
      max_learning_rate: 2e-2
      soft_start: 0.15
      annealing: 0.8
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
}
eval_config {
  validation_period_during_training: 10
  average_precision_mode: SAMPLE
  batch_size: 16
  matching_iou_threshold: 0.5
}
nms_config {
  confidence_threshold: 0.01
  clustering_iou_threshold: 0.6
  top_k: 200
}
augmentation_config {
  output_width: 300
  output_height: 300
  output_channel: 3
}
dataset_config {
  data_sources: {
    label_directory_path: "/workspace/tlt-experiments/data/training/label_2"
    image_directory_path: "/workspace/tlt-experiments/data/training/image_2"
  }
  include_difficult_in_training: true
  target_class_mapping {
    key: "car"
    value: "car"
  }
  target_class_mapping {
    key: "pedestrian"
    value: "pedestrian"
  }
  target_class_mapping {
    key: "cyclist"
    value: "cyclist"
  }
  target_class_mapping {
    key: "van"
    value: "car"
  }
  target_class_mapping {
    key: "person_sitting"
    value: "pedestrian"
  }
  validation_data_sources: {
    label_directory_path: "/workspace/tlt-experiments/data/val/label"
    image_directory_path: "/workspace/tlt-experiments/data/val/image"
  }
}
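As a sanity check that the spec is parsed as intended, the anchor settings in ssd_config and the class mappings in dataset_config can be cross-checked against the layer shapes in the model summary printed in the log further down. This is only a minimal sketch. It assumes the usual SSD convention of one extra box per cell when two_boxes_for_ar1 is true, plus one implicit background class; the feature-map sizes are copied by hand from the ssd_conf_0 to ssd_conf_5 rows of that summary, and all variable names are made up for illustration.

```python
# Values copied from the spec file above.
aspect_ratios = [1.0, 2.0, 0.5, 3.0, 1.0 / 3.0]
two_boxes_for_ar1 = True
target_class_mapping = {
    "car": "car",
    "pedestrian": "pedestrian",
    "cyclist": "cyclist",
    "van": "car",
    "person_sitting": "pedestrian",
}

# Spatial sizes of the six prediction feature maps, read off the
# ssd_conf_0 .. ssd_conf_5 rows of the model summary in the log.
feature_map_sizes = [38, 19, 10, 5, 3, 2]

# Assumption: aspect ratio 1.0 gets a second, larger box when
# two_boxes_for_ar1 is enabled (standard SSD behaviour).
boxes_per_cell = len(aspect_ratios) + (1 if two_boxes_for_ar1 and 1.0 in aspect_ratios else 0)

# Assumption: one background class is added to the mapped target classes.
num_classes = len(set(target_class_mapping.values())) + 1

conf_channels = boxes_per_cell * num_classes   # channels of each ssd_conf_* layer
loc_channels = boxes_per_cell * 4              # 4 box offsets per anchor
total_boxes = sum(size * size * boxes_per_cell for size in feature_map_sizes)
prediction_width = num_classes + 4 + 8         # class scores + offsets + anchor (4 coords, 4 variances)

print("boxes per cell:  ", boxes_per_cell)
print("conf channels:   ", conf_channels)
print("loc channels:    ", loc_channels)
print("total boxes:     ", total_boxes)
print("prediction width:", prediction_width)
```

These figures agree with the summary in the log (24 channels per head, 11658 boxes, last dimension 16), so the spec itself appears to load correctly and the failure happens later, when the training step is built.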
Command :
!echo "To run with multigpu, please change --gpus based on the number of available GPUs in your machine."
!tlt ssd train --gpus 1 --gpu_index=$GPU_INDEX \
    -e $SPECS_DIR/ssd_train_resnet10_kitti.txt \
    -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
    -k $KEY \
    -m $USER_EXPERIMENT_DIR/pretrained_resnet10/tlt_pretrained_object_detection_vresnet10/resnet_10.hdf5
Error :
To run with multigpu, please change --gpus based on the number of available GPUs in your machine.
2021-06-03 05:31:16,622 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the ~/.tlt_mounts.json file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
2021-06-03 05:31:24,565 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:117: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2021-06-03 05:31:24,565 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py:143: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py:63: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
2021-06-03 05:31:24,657 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py:63: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py:66: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2021-06-03 05:31:24,657 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py:66: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2021-06-03 05:31:24,795 [INFO] /usr/local/lib/python3.6/dist-packages/iva/ssd/utils/spec_loader.pyc: Merging specification from /workspace/tlt-experiments/ssd/specs/ssd_train_resnet10_kitti.txt
2021-06-03 05:31:24,811 [INFO] __main__: Loading pretrained weights. This may take a while...
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
2021-06-03 05:31:24,812 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
2021-06-03 05:31:24,813 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
2021-06-03 05:31:24,836 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4185: The name tf.truncated_normal is deprecated. Please use tf.random.truncated_normal instead.
2021-06-03 05:31:25,312 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4185: The name tf.truncated_normal is deprecated. Please use tf.random.truncated_normal instead.
WARNING:tensorflow:From /opt/nvidia/third_party/keras/tensorflow_backend.py:187: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.
2021-06-03 05:31:27,048 [WARNING] tensorflow: From /opt/nvidia/third_party/keras/tensorflow_backend.py:187: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
2021-06-03 05:31:27,194 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
2021-06-03 05:31:27,194 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
2021-06-03 05:31:27,609 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
2021-06-03 05:31:28,158 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3295: The name tf.log is deprecated. Please use tf.math.log instead.
2021-06-03 05:31:28,164 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3295: The name tf.log is deprecated. Please use tf.math.log instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:986: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.
2021-06-03 05:31:28,640 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:986: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:973: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.
2021-06-03 05:31:28,765 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:973: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.
Initialize optimizer
WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:121: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.
2021-06-03 05:31:37,531 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:121: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.
WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:122: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead.
2021-06-03 05:31:37,531 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:122: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead.
WARNING:tensorflow:From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:123: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.
2021-06-03 05:31:37,532 [WARNING] tensorflow: From /home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/utils/tensor_utils.py:123: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.
Layer (type) Output Shape Param # Connected to
Input (InputLayer) (None, 3, 300, 300) 0
conv1 (Conv2D) (None, 64, 150, 150) 9408 Input[0][0]
bn_conv1 (BatchNormalization) (None, 64, 150, 150) 256 conv1[0][0]
activation_1 (Activation) (None, 64, 150, 150) 0 bn_conv1[0][0]
block_1a_conv_1 (Conv2D) (None, 64, 75, 75) 36864 activation_1[0][0]
block_1a_bn_1 (BatchNormalizati (None, 64, 75, 75) 256 block_1a_conv_1[0][0]
block_1a_relu_1 (Activation) (None, 64, 75, 75) 0 block_1a_bn_1[0][0]
block_1a_conv_2 (Conv2D) (None, 64, 75, 75) 36864 block_1a_relu_1[0][0]
block_1a_conv_shortcut (Conv2D) (None, 64, 75, 75) 4096 activation_1[0][0]
block_1a_bn_2 (BatchNormalizati (None, 64, 75, 75) 256 block_1a_conv_2[0][0]
block_1a_bn_shortcut (BatchNorm (None, 64, 75, 75) 256 block_1a_conv_shortcut[0][0]
add_1 (Add) (None, 64, 75, 75) 0 block_1a_bn_2[0][0]
block_1a_bn_shortcut[0][0]
block_1a_relu (Activation) (None, 64, 75, 75) 0 add_1[0][0]
block_2a_conv_1 (Conv2D) (None, 128, 38, 38) 73728 block_1a_relu[0][0]
block_2a_bn_1 (BatchNormalizati (None, 128, 38, 38) 512 block_2a_conv_1[0][0]
block_2a_relu_1 (Activation) (None, 128, 38, 38) 0 block_2a_bn_1[0][0]
block_2a_conv_2 (Conv2D) (None, 128, 38, 38) 147456 block_2a_relu_1[0][0]
block_2a_conv_shortcut (Conv2D) (None, 128, 38, 38) 8192 block_1a_relu[0][0]
block_2a_bn_2 (BatchNormalizati (None, 128, 38, 38) 512 block_2a_conv_2[0][0]
block_2a_bn_shortcut (BatchNorm (None, 128, 38, 38) 512 block_2a_conv_shortcut[0][0]
add_2 (Add) (None, 128, 38, 38) 0 block_2a_bn_2[0][0]
block_2a_bn_shortcut[0][0]
block_2a_relu (Activation) (None, 128, 38, 38) 0 add_2[0][0]
block_3a_conv_1 (Conv2D) (None, 256, 19, 19) 294912 block_2a_relu[0][0]
block_3a_bn_1 (BatchNormalizati (None, 256, 19, 19) 1024 block_3a_conv_1[0][0]
block_3a_relu_1 (Activation) (None, 256, 19, 19) 0 block_3a_bn_1[0][0]
block_3a_conv_2 (Conv2D) (None, 256, 19, 19) 589824 block_3a_relu_1[0][0]
block_3a_conv_shortcut (Conv2D) (None, 256, 19, 19) 32768 block_2a_relu[0][0]
block_3a_bn_2 (BatchNormalizati (None, 256, 19, 19) 1024 block_3a_conv_2[0][0]
block_3a_bn_shortcut (BatchNorm (None, 256, 19, 19) 1024 block_3a_conv_shortcut[0][0]
add_3 (Add) (None, 256, 19, 19) 0 block_3a_bn_2[0][0]
block_3a_bn_shortcut[0][0]
block_3a_relu (Activation) (None, 256, 19, 19) 0 add_3[0][0]
block_4a_conv_1 (Conv2D) (None, 512, 19, 19) 1179648 block_3a_relu[0][0]
block_4a_bn_1 (BatchNormalizati (None, 512, 19, 19) 2048 block_4a_conv_1[0][0]
block_4a_relu_1 (Activation) (None, 512, 19, 19) 0 block_4a_bn_1[0][0]
block_4a_conv_2 (Conv2D) (None, 512, 19, 19) 2359296 block_4a_relu_1[0][0]
block_4a_conv_shortcut (Conv2D) (None, 512, 19, 19) 131072 block_3a_relu[0][0]
block_4a_bn_2 (BatchNormalizati (None, 512, 19, 19) 2048 block_4a_conv_2[0][0]
block_4a_bn_shortcut (BatchNorm (None, 512, 19, 19) 2048 block_4a_conv_shortcut[0][0]
add_4 (Add) (None, 512, 19, 19) 0 block_4a_bn_2[0][0]
block_4a_bn_shortcut[0][0]
block_4a_relu (Activation) (None, 512, 19, 19) 0 add_4[0][0]
ssd_expand_block_1_conv_0 (Conv (None, 64, 19, 19) 32832 block_4a_relu[0][0]
ssd_expand_block_1_relu_0 (ReLU (None, 64, 19, 19) 0 ssd_expand_block_1_conv_0[0][0]
ssd_expand_block_1_conv_1 (Conv (None, 128, 10, 10) 73728 ssd_expand_block_1_relu_0[0][0]
ssd_expand_block_1_bn_1 (BatchN (None, 128, 10, 10) 512 ssd_expand_block_1_conv_1[0][0]
ssd_expand_block_1_relu_1 (ReLU (None, 128, 10, 10) 0 ssd_expand_block_1_bn_1[0][0]
ssd_expand_block_2_conv_0 (Conv (None, 64, 10, 10) 8256 ssd_expand_block_1_relu_1[0][0]
ssd_expand_block_2_relu_0 (ReLU (None, 64, 10, 10) 0 ssd_expand_block_2_conv_0[0][0]
ssd_expand_block_2_conv_1 (Conv (None, 128, 5, 5) 73728 ssd_expand_block_2_relu_0[0][0]
ssd_expand_block_2_bn_1 (BatchN (None, 128, 5, 5) 512 ssd_expand_block_2_conv_1[0][0]
ssd_expand_block_2_relu_1 (ReLU (None, 128, 5, 5) 0 ssd_expand_block_2_bn_1[0][0]
ssd_expand_block_3_conv_0 (Conv (None, 64, 5, 5) 8256 ssd_expand_block_2_relu_1[0][0]
ssd_expand_block_3_relu_0 (ReLU (None, 64, 5, 5) 0 ssd_expand_block_3_conv_0[0][0]
ssd_expand_block_3_conv_1 (Conv (None, 128, 3, 3) 73728 ssd_expand_block_3_relu_0[0][0]
ssd_expand_block_3_bn_1 (BatchN (None, 128, 3, 3) 512 ssd_expand_block_3_conv_1[0][0]
ssd_expand_block_3_relu_1 (ReLU (None, 128, 3, 3) 0 ssd_expand_block_3_bn_1[0][0]
ssd_expand_block_4_conv_0 (Conv (None, 64, 3, 3) 8256 ssd_expand_block_3_relu_1[0][0]
ssd_expand_block_4_relu_0 (ReLU (None, 64, 3, 3) 0 ssd_expand_block_4_conv_0[0][0]
ssd_expand_block_4_conv_1 (Conv (None, 128, 2, 2) 73728 ssd_expand_block_4_relu_0[0][0]
ssd_expand_block_4_bn_1 (BatchN (None, 128, 2, 2) 512 ssd_expand_block_4_conv_1[0][0]
ssd_expand_block_4_relu_1 (ReLU (None, 128, 2, 2) 0 ssd_expand_block_4_bn_1[0][0]
ssd_conf_0 (Conv2D) (None, 24, 38, 38) 27672 block_2a_relu[0][0]
ssd_conf_1 (Conv2D) (None, 24, 19, 19) 110616 block_4a_relu[0][0]
ssd_conf_2 (Conv2D) (None, 24, 10, 10) 27672 ssd_expand_block_1_relu_1[0][0]
ssd_conf_3 (Conv2D) (None, 24, 5, 5) 27672 ssd_expand_block_2_relu_1[0][0]
ssd_conf_4 (Conv2D) (None, 24, 3, 3) 27672 ssd_expand_block_3_relu_1[0][0]
ssd_conf_5 (Conv2D) (None, 24, 2, 2) 27672 ssd_expand_block_4_relu_1[0][0]
permute_1 (Permute) (None, 38, 38, 24) 0 ssd_conf_0[0][0]
permute_2 (Permute) (None, 19, 19, 24) 0 ssd_conf_1[0][0]
permute_3 (Permute) (None, 10, 10, 24) 0 ssd_conf_2[0][0]
permute_4 (Permute) (None, 5, 5, 24) 0 ssd_conf_3[0][0]
permute_5 (Permute) (None, 3, 3, 24) 0 ssd_conf_4[0][0]
permute_6 (Permute) (None, 2, 2, 24) 0 ssd_conf_5[0][0]
conf_reshape_0 (Reshape) (None, 8664, 1, 4) 0 permute_1[0][0]
conf_reshape_1 (Reshape) (None, 2166, 1, 4) 0 permute_2[0][0]
conf_reshape_2 (Reshape) (None, 600, 1, 4) 0 permute_3[0][0]
conf_reshape_3 (Reshape) (None, 150, 1, 4) 0 permute_4[0][0]
conf_reshape_4 (Reshape) (None, 54, 1, 4) 0 permute_5[0][0]
conf_reshape_5 (Reshape) (None, 24, 1, 4) 0 permute_6[0][0]
mbox_conf (Concatenate) (None, 11658, 1, 4) 0 conf_reshape_0[0][0]
conf_reshape_1[0][0]
conf_reshape_2[0][0]
conf_reshape_3[0][0]
conf_reshape_4[0][0]
conf_reshape_5[0][0]
ssd_loc_0 (Conv2D) (None, 24, 38, 38) 27672 block_2a_relu[0][0]
ssd_loc_1 (Conv2D) (None, 24, 19, 19) 110616 block_4a_relu[0][0]
ssd_loc_2 (Conv2D) (None, 24, 10, 10) 27672 ssd_expand_block_1_relu_1[0][0]
ssd_loc_3 (Conv2D) (None, 24, 5, 5) 27672 ssd_expand_block_2_relu_1[0][0]
ssd_loc_4 (Conv2D) (None, 24, 3, 3) 27672 ssd_expand_block_3_relu_1[0][0]
ssd_loc_5 (Conv2D) (None, 24, 2, 2) 27672 ssd_expand_block_4_relu_1[0][0]
before_softmax_permute (Permute (None, 4, 1, 11658) 0 mbox_conf[0][0]
permute_7 (Permute) (None, 38, 38, 24) 0 ssd_loc_0[0][0]
permute_8 (Permute) (None, 19, 19, 24) 0 ssd_loc_1[0][0]
permute_9 (Permute) (None, 10, 10, 24) 0 ssd_loc_2[0][0]
permute_10 (Permute) (None, 5, 5, 24) 0 ssd_loc_3[0][0]
permute_11 (Permute) (None, 3, 3, 24) 0 ssd_loc_4[0][0]
permute_12 (Permute) (None, 2, 2, 24) 0 ssd_loc_5[0][0]
ssd_anchor_0 (AnchorBoxes) (None, 1444, 6, 8) 0 ssd_loc_0[0][0]
ssd_anchor_1 (AnchorBoxes) (None, 361, 6, 8) 0 ssd_loc_1[0][0]
ssd_anchor_2 (AnchorBoxes) (None, 100, 6, 8) 0 ssd_loc_2[0][0]
ssd_anchor_3 (AnchorBoxes) (None, 25, 6, 8) 0 ssd_loc_3[0][0]
ssd_anchor_4 (AnchorBoxes) (None, 9, 6, 8) 0 ssd_loc_4[0][0]
ssd_anchor_5 (AnchorBoxes) (None, 4, 6, 8) 0 ssd_loc_5[0][0]
mbox_conf_softmax_ (Softmax) (None, 4, 1, 11658) 0 before_softmax_permute[0][0]
loc_reshape_0 (Reshape) (None, 8664, 1, 4) 0 permute_7[0][0]
loc_reshape_1 (Reshape) (None, 2166, 1, 4) 0 permute_8[0][0]
loc_reshape_2 (Reshape) (None, 600, 1, 4) 0 permute_9[0][0]
loc_reshape_3 (Reshape) (None, 150, 1, 4) 0 permute_10[0][0]
loc_reshape_4 (Reshape) (None, 54, 1, 4) 0 permute_11[0][0]
loc_reshape_5 (Reshape) (None, 24, 1, 4) 0 permute_12[0][0]
anchor_reshape_0 (Reshape) (None, 8664, 1, 8) 0 ssd_anchor_0[0][0]
anchor_reshape_1 (Reshape) (None, 2166, 1, 8) 0 ssd_anchor_1[0][0]
anchor_reshape_2 (Reshape) (None, 600, 1, 8) 0 ssd_anchor_2[0][0]
anchor_reshape_3 (Reshape) (None, 150, 1, 8) 0 ssd_anchor_3[0][0]
anchor_reshape_4 (Reshape) (None, 54, 1, 8) 0 ssd_anchor_4[0][0]
anchor_reshape_5 (Reshape) (None, 24, 1, 8) 0 ssd_anchor_5[0][0]
mbox_conf_softmax (Permute) (None, 11658, 1, 4) 0 mbox_conf_softmax_[0][0]
mbox_loc (Concatenate) (None, 11658, 1, 4) 0 loc_reshape_0[0][0]
loc_reshape_1[0][0]
loc_reshape_2[0][0]
loc_reshape_3[0][0]
loc_reshape_4[0][0]
loc_reshape_5[0][0]
mbox_priorbox (Concatenate) (None, 11658, 1, 8) 0 anchor_reshape_0[0][0]
anchor_reshape_1[0][0]
anchor_reshape_2[0][0]
anchor_reshape_3[0][0]
anchor_reshape_4[0][0]
anchor_reshape_5[0][0]
concatenate_1 (Concatenate) (None, 11658, 1, 16) 0 mbox_conf_softmax[0][0]
mbox_loc[0][0]
mbox_priorbox[0][0]
ssd_predictions (Reshape) (None, 11658, 16) 0 concatenate_1[0][0]
Total params: 5,768,416
Trainable params: 5,752,096
Non-trainable params: 16,320
2021-06-03 05:31:37,702 [INFO] __main__: Number of images in the training dataset: 6733
2021-06-03 05:31:37,702 [INFO] __main__: Number of images in the validation dataset: 748
Epoch 1/80
Traceback (most recent call last):
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py", line 313, in <module>
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py", line 309, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/scripts/train.py", line 237, in run_experiment
  File "/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2671, in _call
    session)
  File "/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py", line 2623, in _make_callable
    callable_fn = session._make_callable_from_options(callable_opts)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1505, in _make_callable_from_options
    return BaseSession._Callable(self, callable_options)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1460, in __init__
    session._session, options_ptr)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Conv2DCustomBackpropInputOp only supports NHWC.
[[{{node training_1/SGD/gradients/ssd_loc_0/convolution_grad/Conv2DBackpropInput}}]]
Traceback (most recent call last):
  File "/usr/local/bin/ssd", line 8, in <module>
    sys.exit(main())
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/ssd/entrypoint/ssd.py", line 12, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/entrypoint/entrypoint.py", line 296, in launch_job
AssertionError: Process run failed.
2021-06-03 05:31:46,060 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
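One thing that may be worth ruling out: "Conv2DCustomBackpropInputOp only supports NHWC" is raised by TensorFlow's CPU convolution-gradient kernel, while the model summary above is channels-first (input shape (None, 3, 300, 300)). That suggests the gradient op was placed on the CPU, which could happen if no GPU is visible to the training process. The log does not confirm this, so treat it as an assumption. Below is a minimal sketch of a visibility check that could be run on the host or inside the container; the helper name visible_gpus is made up for illustration, and it relies only on the nvidia-smi command-line tool being on the PATH.

```python
import shutil
import subprocess


def visible_gpus():
    """Return the GPU names reported by nvidia-smi, or [] if none can be queried."""
    if shutil.which("nvidia-smi") is None:
        # Driver tools are not installed or not on PATH in this environment.
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # text mode; also works on Python 3.6
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        # nvidia-smi exists but could not talk to the driver.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus:
        for index, name in enumerate(gpus):
            print("GPU {}: {}".format(index, name))
    else:
        print("No GPU visible; TensorFlow would fall back to CPU kernels.")
```

If the list is empty inside the container but not on the host, the GPU is probably not being passed through to Docker; the value of $GPU_INDEX used in the command is also worth checking against the indices printed here.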