Training custom model using Yolo_v4_tiny

Hi,

I have trained a custom model using the yolo_v4_tiny Jupyter Notebook sample. Once training and pruning of the model had completed, I tried to re-train, but I get the following error:
ValueError: Error when checking input: expected Input to have shape (3, 480, 640) but got array with shape (3, 736, 896)

My dataset has images of various resolutions, and I have set the augmentation_config as follows (the anchors were also calculated for 640 x 480):

augmentation_config {
  hue: 0.1
  saturation: 1.5
  exposure: 1.5
  vertical_flip: 0
  horizontal_flip: 0.5
  jitter: 0.3
  output_width: 640
  output_height: 480
  output_channel: 3
  randomize_input_shape_period: 10
  mosaic_prob: 0.5
  mosaic_min_ratio: 0.2
}
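One thing that may be worth checking with a config like this is whether the spec used for a given run resolves to the same input size the loaded model expects. Below is a minimal sketch in Python, assuming the spec is available as plain text. The helpers `read_int_fields` and `check_input_size` are hypothetical and not part of TAO, the multiple-of-32 check is an assumption about the network's deepest stride, and the note on `randomize_input_shape_period` reflects my reading of the field name rather than anything confirmed in this thread.

```python
import re

def read_int_fields(spec_text, names):
    """Pull `name: value` integer fields out of a protobuf-text style spec."""
    found = {}
    for name in names:
        match = re.search(r"\b" + re.escape(name) + r"\s*:\s*(\d+)", spec_text)
        if match:
            found[name] = int(match.group(1))
    return found

def check_input_size(spec_text, expected_chw, stride=32):
    """Return a list of readable problems; an empty list means nothing obvious."""
    fields = read_int_fields(
        spec_text,
        ["output_width", "output_height", "output_channel",
         "randomize_input_shape_period"],
    )
    problems = []
    spec_chw = (fields.get("output_channel"),
                fields.get("output_height"),
                fields.get("output_width"))
    if spec_chw != tuple(expected_chw):
        problems.append(f"spec resolves to {spec_chw}, model expects {tuple(expected_chw)}")
    for key in ("output_width", "output_height"):
        value = fields.get(key)
        # Assumption: sizes should be divisible by the deepest stride.
        if value is not None and value % stride != 0:
            problems.append(f"{key}={value} is not a multiple of {stride}")
    # Assumption: a non-zero period means the input size can vary between batches.
    if fields.get("randomize_input_shape_period", 0) > 0:
        problems.append("randomize_input_shape_period > 0: input size may vary during training")
    return problems

spec = """
augmentation_config {
  output_width: 640
  output_height: 480
  output_channel: 3
  randomize_input_shape_period: 10
}
"""
for problem in check_input_size(spec, expected_chw=(3, 480, 640)):
    print(problem)
```

Running the same check against the retraining spec, with the shape the error message says the model expects, is a cheap way to rule out a plain mismatch before looking further.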

Do you have the full logs from pruning and retraining?

Hi Morgan,

Here is the prune log:
2021-12-03 14:22:22,819 [INFO] root: Registry: ['nvcr.io']
2021-12-03 14:22:22,881 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3
2021-12-03 14:22:22,894 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/buhund/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

2021-12-03 12:22:30,417 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

2021-12-03 12:22:30,436 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:245: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

2021-12-03 12:22:30,448 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:245: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

2021-12-03 12:22:30,459 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

2021-12-03 12:22:30,466 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

WARNING:tensorflow:From /opt/nvidia/third_party/keras/tensorflow_backend.py:183: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

2021-12-03 12:22:30,639 [WARNING] tensorflow: From /opt/nvidia/third_party/keras/tensorflow_backend.py:183: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:2018: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

2021-12-03 12:22:30,977 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:2018: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

2021-12-03 12:22:31,252 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:181: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

2021-12-03 12:22:31,252 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:181: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:186: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2021-12-03 12:22:31,252 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:186: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

2021-12-03 12:22:31,590 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

2021-12-03 12:22:31,591 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

2021-12-03 12:22:31,790 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

2021-12-03 12:22:32,025 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/losses/base_loss.py:40: The name tf.log is deprecated. Please use tf.math.log instead.

2021-12-03 12:22:32,092 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/losses/base_loss.py:40: The name tf.log is deprecated. Please use tf.math.log instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:986: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

2021-12-03 12:22:33,947 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:986: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:973: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.

2021-12-03 12:22:34,388 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:973: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.

2021-12-03 12:22:36,494 [INFO] modulus.pruning.pruning: Exploring graph for retainable indices
2021-12-03 12:22:38,491 [INFO] modulus.pruning.pruning: Pruning model and appending pruned nodes to new graph
2021-12-03 12:23:14,108 [INFO] main: Pruning ratio (pruned model / original model): 1.0
2021-12-03 14:23:15,805 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
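The prune log reports a pruning ratio of 1.0, which by the log's own label (pruned model / original model) suggests the pruned model is about the same size as the original. A small sketch for pulling that figure out of a saved log, assuming the log text is at hand; `pruning_ratio` is a hypothetical helper, not a TAO function.

```python
import re

def pruning_ratio(log_text):
    """Return the last pruning ratio reported in the log, or None if absent."""
    matches = re.findall(
        r"Pruning ratio \(pruned model / original model\):\s*([0-9.]+)",
        log_text,
    )
    return float(matches[-1]) if matches else None

log = ("2021-12-03 12:23:14,108 [INFO] main: "
       "Pruning ratio (pruned model / original model): 1.0")
ratio = pruning_ratio(log)
if ratio is not None and ratio >= 1.0:
    print(f"ratio {ratio}: the pruned model is not smaller than the original")
```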

and here is the re-train log:

2021-12-03 15:10:40,194 [INFO] root: Registry: ['nvcr.io']
2021-12-03 15:10:40,257 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3
2021-12-03 15:10:40,270 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/buhund/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py:40: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

2021-12-03 13:10:47,181 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py:40: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py:43: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2021-12-03 13:10:47,181 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py:43: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

2021-12-03 13:10:47,522 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/data_loader/generate_shape_tensors.py:8: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

2021-12-03 13:10:47,542 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/data_loader/generate_shape_tensors.py:8: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/data_loader/generate_shape_tensors.py:8: The name tf.AUTO_REUSE is deprecated. Please use tf.compat.v1.AUTO_REUSE instead.

2021-12-03 13:10:47,542 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/data_loader/generate_shape_tensors.py:8: The name tf.AUTO_REUSE is deprecated. Please use tf.compat.v1.AUTO_REUSE instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/data_loader/generate_shape_tensors.py:9: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

2021-12-03 13:10:47,543 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/data_loader/generate_shape_tensors.py:9: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/data_loader/generate_shape_tensors.py:55: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.

2021-12-03 13:10:47,547 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/data_loader/generate_shape_tensors.py:55: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

2021-12-03 13:10:47,571 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

2021-12-03 13:10:47,573 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

2021-12-03 13:10:47,589 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING:tensorflow:From /opt/nvidia/third_party/keras/tensorflow_backend.py:183: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

2021-12-03 13:10:47,731 [WARNING] tensorflow: From /opt/nvidia/third_party/keras/tensorflow_backend.py:183: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:2018: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

2021-12-03 13:10:48,036 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:2018: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

2021-12-03 13:10:49,293 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

2021-12-03 13:10:49,293 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

2021-12-03 13:10:49,293 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

2021-12-03 13:10:49,568 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was not compiled. Compile it manually.
warnings.warn('No training configuration found in save file: '
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/losses/base_loss.py:40: The name tf.log is deprecated. Please use tf.math.log instead.

2021-12-03 13:10:55,211 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/losses/base_loss.py:40: The name tf.log is deprecated. Please use tf.math.log instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

2021-12-03 13:10:55,354 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.


Layer (type) Output Shape Param # Connected to

Input (InputLayer) (None, 3, 480, 640) 0


conv_0 (Conv2D) (None, 32, 240, 320) 864 Input[0][0]


conv_0_bn (BatchNormalization) (None, 32, 240, 320) 128 conv_0[0][0]


conv_0_mish (LeakyReLU) (None, 32, 240, 320) 0 conv_0_bn[0][0]


conv_1 (Conv2D) (None, 64, 120, 160) 18432 conv_0_mish[0][0]


conv_1_bn (BatchNormalization) (None, 64, 120, 160) 256 conv_1[0][0]


conv_1_mish (LeakyReLU) (None, 64, 120, 160) 0 conv_1_bn[0][0]


conv_2_conv_0 (Conv2D) (None, 64, 120, 160) 36864 conv_1_mish[0][0]


conv_2_conv_0_bn (BatchNormaliz (None, 64, 120, 160) 256 conv_2_conv_0[0][0]


conv_2_conv_0_mish (LeakyReLU) (None, 64, 120, 160) 0 conv_2_conv_0_bn[0][0]


conv_2_split_0 (Split) (None, 32, 120, 160) 0 conv_2_conv_0_mish[0][0]


conv_2_conv_1 (Conv2D) (None, 32, 120, 160) 9216 conv_2_split_0[0][0]


conv_2_conv_1_bn (BatchNormaliz (None, 32, 120, 160) 128 conv_2_conv_1[0][0]


conv_2_conv_1_mish (LeakyReLU) (None, 32, 120, 160) 0 conv_2_conv_1_bn[0][0]


conv_2_conv_2 (Conv2D) (None, 32, 120, 160) 9216 conv_2_conv_1_mish[0][0]


conv_2_conv_2_bn (BatchNormaliz (None, 32, 120, 160) 128 conv_2_conv_2[0][0]


conv_2_conv_2_mish (LeakyReLU) (None, 32, 120, 160) 0 conv_2_conv_2_bn[0][0]


conv_2_concat_0 (Concatenate) (None, 64, 120, 160) 0 conv_2_conv_2_mish[0][0]
conv_2_conv_1_mish[0][0]


conv_2_conv_3 (Conv2D) (None, 64, 120, 160) 4096 conv_2_concat_0[0][0]


conv_2_conv_3_bn (BatchNormaliz (None, 64, 120, 160) 256 conv_2_conv_3[0][0]


conv_2_conv_3_mish (LeakyReLU) (None, 64, 120, 160) 0 conv_2_conv_3_bn[0][0]


conv_2_concat_1 (Concatenate) (None, 128, 120, 160 0 conv_2_conv_0_mish[0][0]
conv_2_conv_3_mish[0][0]


conv_2_pool_0 (MaxPooling2D) (None, 128, 60, 80) 0 conv_2_concat_1[0][0]


conv_3_conv_0 (Conv2D) (None, 128, 60, 80) 147456 conv_2_pool_0[0][0]


conv_3_conv_0_bn (BatchNormaliz (None, 128, 60, 80) 512 conv_3_conv_0[0][0]


conv_3_conv_0_mish (LeakyReLU) (None, 128, 60, 80) 0 conv_3_conv_0_bn[0][0]


conv_3_split_0 (Split) (None, 64, 60, 80) 0 conv_3_conv_0_mish[0][0]


conv_3_conv_1 (Conv2D) (None, 64, 60, 80) 36864 conv_3_split_0[0][0]


conv_3_conv_1_bn (BatchNormaliz (None, 64, 60, 80) 256 conv_3_conv_1[0][0]


conv_3_conv_1_mish (LeakyReLU) (None, 64, 60, 80) 0 conv_3_conv_1_bn[0][0]


conv_3_conv_2 (Conv2D) (None, 64, 60, 80) 36864 conv_3_conv_1_mish[0][0]


conv_3_conv_2_bn (BatchNormaliz (None, 64, 60, 80) 256 conv_3_conv_2[0][0]


conv_3_conv_2_mish (LeakyReLU) (None, 64, 60, 80) 0 conv_3_conv_2_bn[0][0]


conv_3_concat_0 (Concatenate) (None, 128, 60, 80) 0 conv_3_conv_2_mish[0][0]
conv_3_conv_1_mish[0][0]


conv_3_conv_3 (Conv2D) (None, 128, 60, 80) 16384 conv_3_concat_0[0][0]


conv_3_conv_3_bn (BatchNormaliz (None, 128, 60, 80) 512 conv_3_conv_3[0][0]


conv_3_conv_3_mish (LeakyReLU) (None, 128, 60, 80) 0 conv_3_conv_3_bn[0][0]


conv_3_concat_1 (Concatenate) (None, 256, 60, 80) 0 conv_3_conv_0_mish[0][0]
conv_3_conv_3_mish[0][0]


conv_3_pool_0 (MaxPooling2D) (None, 256, 30, 40) 0 conv_3_concat_1[0][0]


conv_4_conv_0 (Conv2D) (None, 256, 30, 40) 589824 conv_3_pool_0[0][0]


conv_4_conv_0_bn (BatchNormaliz (None, 256, 30, 40) 1024 conv_4_conv_0[0][0]


conv_4_conv_0_mish (LeakyReLU) (None, 256, 30, 40) 0 conv_4_conv_0_bn[0][0]


conv_4_split_0 (Split) (None, 128, 30, 40) 0 conv_4_conv_0_mish[0][0]


conv_4_conv_1 (Conv2D) (None, 128, 30, 40) 147456 conv_4_split_0[0][0]


conv_4_conv_1_bn (BatchNormaliz (None, 128, 30, 40) 512 conv_4_conv_1[0][0]


conv_4_conv_1_mish (LeakyReLU) (None, 128, 30, 40) 0 conv_4_conv_1_bn[0][0]


conv_4_conv_2 (Conv2D) (None, 128, 30, 40) 147456 conv_4_conv_1_mish[0][0]


conv_4_conv_2_bn (BatchNormaliz (None, 128, 30, 40) 512 conv_4_conv_2[0][0]


conv_4_conv_2_mish (LeakyReLU) (None, 128, 30, 40) 0 conv_4_conv_2_bn[0][0]


conv_4_concat_0 (Concatenate) (None, 256, 30, 40) 0 conv_4_conv_2_mish[0][0]
conv_4_conv_1_mish[0][0]


conv_4_conv_3 (Conv2D) (None, 256, 30, 40) 65536 conv_4_concat_0[0][0]


conv_4_conv_3_bn (BatchNormaliz (None, 256, 30, 40) 1024 conv_4_conv_3[0][0]


conv_4_conv_3_mish (LeakyReLU) (None, 256, 30, 40) 0 conv_4_conv_3_bn[0][0]


conv_4_concat_1 (Concatenate) (None, 512, 30, 40) 0 conv_4_conv_0_mish[0][0]
conv_4_conv_3_mish[0][0]


conv_4_pool_0 (MaxPooling2D) (None, 512, 15, 20) 0 conv_4_concat_1[0][0]


conv_5 (Conv2D) (None, 512, 15, 20) 2359296 conv_4_pool_0[0][0]


conv_5_bn (BatchNormalization) (None, 512, 15, 20) 2048 conv_5[0][0]


conv_5_mish (LeakyReLU) (None, 512, 15, 20) 0 conv_5_bn[0][0]


yolo_conv1_1 (Conv2D) (None, 256, 15, 20) 131072 conv_5_mish[0][0]


yolo_conv1_1_bn (BatchNormaliza (None, 256, 15, 20) 1024 yolo_conv1_1[0][0]


yolo_conv1_1_lrelu (LeakyReLU) (None, 256, 15, 20) 0 yolo_conv1_1_bn[0][0]


yolo_conv2 (Conv2D) (None, 128, 15, 20) 32768 yolo_conv1_1_lrelu[0][0]


yolo_conv2_bn (BatchNormalizati (None, 128, 15, 20) 512 yolo_conv2[0][0]


yolo_conv2_lrelu (LeakyReLU) (None, 128, 15, 20) 0 yolo_conv2_bn[0][0]


upsample0 (UpSampling2D) (None, 128, 30, 40) 0 yolo_conv2_lrelu[0][0]


concatenate_2 (Concatenate) (None, 384, 30, 40) 0 upsample0[0][0]
conv_4_conv_3_mish[0][0]


yolo_conv1_6 (Conv2D) (None, 512, 15, 20) 1179648 yolo_conv1_1_lrelu[0][0]


yolo_conv3_6 (Conv2D) (None, 256, 30, 40) 884736 concatenate_2[0][0]


yolo_conv1_6_bn (BatchNormaliza (None, 512, 15, 20) 2048 yolo_conv1_6[0][0]


yolo_conv3_6_bn (BatchNormaliza (None, 256, 30, 40) 1024 yolo_conv3_6[0][0]


yolo_conv1_6_lrelu (LeakyReLU) (None, 512, 15, 20) 0 yolo_conv1_6_bn[0][0]


yolo_conv3_6_lrelu (LeakyReLU) (None, 256, 30, 40) 0 yolo_conv3_6_bn[0][0]


conv_big_object (Conv2D) (None, 24, 15, 20) 12312 yolo_conv1_6_lrelu[0][0]


conv_mid_object (Conv2D) (None, 24, 30, 40) 6168 yolo_conv3_6_lrelu[0][0]


bg_permute (Permute) (None, 15, 20, 24) 0 conv_big_object[0][0]


md_permute (Permute) (None, 30, 40, 24) 0 conv_mid_object[0][0]


bg_reshape (Reshape) (None, 900, 8) 0 bg_permute[0][0]


md_reshape (Reshape) (None, 3600, 8) 0 md_permute[0][0]


bg_anchor (YOLOAnchorBox) (None, 900, 6) 0 conv_big_object[0][0]


bg_bbox_processor (BBoxPostProc (None, 900, 8) 0 bg_reshape[0][0]


md_anchor (YOLOAnchorBox) (None, 3600, 6) 0 conv_mid_object[0][0]


md_bbox_processor (BBoxPostProc (None, 3600, 8) 0 md_reshape[0][0]


encoded_bg (Concatenate) (None, 900, 14) 0 bg_anchor[0][0]
bg_bbox_processor[0][0]


encoded_md (Concatenate) (None, 3600, 14) 0 md_anchor[0][0]
md_bbox_processor[0][0]


encoded_detections (Concatenate (None, 4500, 14) 0 encoded_bg[0][0]
encoded_md[0][0]

Total params: 5,884,944
Trainable params: 5,878,736
Non-trainable params: 6,208
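The shapes in this summary are consistent with a 480 x 640 input. As a quick arithmetic check, the sketch below assumes two detection heads at strides 32 and 16 with 3 anchors each (read off the summary: 480 / 32 = 15 and 480 / 16 = 30). `head_rows` is a hypothetical helper used only for illustration.

```python
def head_rows(height, width, strides=(32, 16), anchors_per_head=3):
    """Number of predicted boxes per head for a given input size."""
    return [(height // s) * (width // s) * anchors_per_head for s in strides]

rows = head_rows(480, 640)
print(rows, sum(rows))  # [900, 3600] 4500, matching bg/md/encoded_detections

# The same layout fed a 736 x 896 input would yield a different row count.
rows_big = head_rows(736, 896)
print(rows_big, sum(rows_big))  # [1932, 7728] 9660
```

The graph printed here is built for a fixed 480 x 640 input, so batches of any other size would not line up with the 4500 encoded detections.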


WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/utils/tensor_utils.py:7: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.

2021-12-03 13:10:55,560 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/utils/tensor_utils.py:7: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/utils/tensor_utils.py:8: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead.

2021-12-03 13:10:55,560 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/utils/tensor_utils.py:8: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/utils/tensor_utils.py:9: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

2021-12-03 13:10:55,560 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/utils/tensor_utils.py:9: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:986: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

2021-12-03 13:10:57,052 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:986: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

Epoch 1/100
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 110, in
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py", line 528, in return_func
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py", line 516, in return_func
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 106, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 63, in run_experiment
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/models/yolov4_model.py", line 644, in train
  File "/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1211, in train_on_batch
    class_weight=class_weight)
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 751, in _standardize_user_data
    exception_prefix='input')
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py", line 138, in standardize_input_data
    str(data_shape))
ValueError: Error when checking input: expected Input to have shape (3, 480, 640) but got array with shape (3, 736, 896)
terminate called without an active exception
[6fba8a984de0:00049] *** Process received signal ***
[6fba8a984de0:00049] Signal: Aborted (6)
[6fba8a984de0:00049] Signal code: (-6)
[6fba8a984de0:00049] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x3f040)[0x7fde798dc040]
[6fba8a984de0:00049] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7fde798dbfb7]
[6fba8a984de0:00049] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7fde798dd921]
[6fba8a984de0:00049] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8c957)[0x7fde02680957]
[6fba8a984de0:00049] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92ae6)[0x7fde02686ae6]
[6fba8a984de0:00049] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92b21)[0x7fde02686b21]
[6fba8a984de0:00049] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(__gxx_personality_v0+0x2da)[0x7fde026864ea]
[6fba8a984de0:00049] [ 7] /lib/x86_64-linux-gnu/libgcc_s.so.1(+0x10668)[0x7fde758a6668]
[6fba8a984de0:00049] [ 8] /lib/x86_64-linux-gnu/libgcc_s.so.1(_Unwind_ForcedUnwind+0x12c)[0x7fde758a6c5c]
[6fba8a984de0:00049] [ 9] /lib/x86_64-linux-gnu/libpthread.so.0(__pthread_unwind+0x40)[0x7fde7968f000]
[6fba8a984de0:00049] [10] /lib/x86_64-linux-gnu/libpthread.so.0(+0x8ae5)[0x7fde79686ae5]
[6fba8a984de0:00049] [11] /lib/x86_64-linux-gnu/libc.so.6(pthread_exit+0x24)[0x7fde799cd394]
[6fba8a984de0:00049] [12] python3.6[0x632045]
[6fba8a984de0:00049] [13] python3.6[0x50baf7]
[6fba8a984de0:00049] [14] python3.6(PyGILState_Ensure+0x47)[0x637e87]
[6fba8a984de0:00049] [15] /usr/local/lib/python3.6/dist-packages/cv2/cv2.cpython-36m-x86_64-linux-gnu.so(+0x12f835)[0x7fddce3a5835]
[6fba8a984de0:00049] [16] /usr/local/lib/python3.6/dist-packages/cv2/cv2.cpython-36m-x86_64-linux-gnu.so(+0x14bb35)[0x7fddce3c1b35]
[6fba8a984de0:00049] [17] /usr/local/lib/python3.6/dist-packages/cv2/cv2.cpython-36m-x86_64-linux-gnu.so(+0x1c54e6)[0x7fddce43b4e6]
[6fba8a984de0:00049] [18] python3.6[0x50a5a5]
[6fba8a984de0:00049] [19] python3.6(_PyEval_EvalFrameDefault+0x444)[0x50bf44]
[6fba8a984de0:00049] [20] python3.6[0x507cd4]
[6fba8a984de0:00049] [21] python3.6[0x509a00]
[6fba8a984de0:00049] [22] python3.6[0x50a3fd]
[6fba8a984de0:00049] [23] python3.6(_PyEval_EvalFrameDefault+0x444)[0x50bf44]
[6fba8a984de0:00049] [24] python3.6[0x5096c8]
[6fba8a984de0:00049] [25] python3.6[0x50a3fd]
[6fba8a984de0:00049] [26] python3.6(_PyEval_EvalFrameDefault+0x444)[0x50bf44]
[6fba8a984de0:00049] [27] python3.6[0x5096c8]
[6fba8a984de0:00049] [28] python3.6[0x50a3fd]
[6fba8a984de0:00049] [29] python3.6(_PyEval_EvalFrameDefault+0x444)[0x50bf44]
[6fba8a984de0:00049] *** End of error message ***
Aborted (core dumped)
2021-12-03 15:11:04,408 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Could you share the retraining command and retraining spec file as well? Thanks a lot.

Herewith the retraining command:

# Retraining using the pruned model as pretrained weights

!tao yolo_v4_tiny train --gpus 1 \
-e $SPECS_DIR/yolo_v4_tiny_retrain_kitti_seq.txt \
-r $USER_EXPERIMENT_DIR/experiment_dir_retrain \
-k $KEY

and the Spec file:

random_seed: 42
yolov4_config {
big_anchor_shape: "[(155.00, 89.00), (113.00, 194.00), (282.00, 232.00)]"
mid_anchor_shape: "[(25.00, 34.00), (43.00, 62.00), (62.00, 119.00)]"
box_matching_iou: 0.5
matching_neutral_box_iou: 0.5
arch: "cspdarknet_tiny"
loss_loc_weight: 1.0
loss_neg_obj_weights: 1.0
loss_class_weights: 1.0
label_smoothing: 0.0
big_grid_xy_extend: 0.05
mid_grid_xy_extend: 0.05
freeze_bn: false
#freeze_blocks: 0
force_relu: false
}
training_config {
batch_size_per_gpu: 8
num_epochs: 100
enable_qat: false
checkpoint_interval: 10
learning_rate {
soft_start_cosine_annealing_schedule {
min_learning_rate: 1e-7
max_learning_rate: 1e-4
soft_start: 0.3
}
}
regularizer {
type: NO_REG
weight: 3e-9
}
optimizer {
adam {
epsilon: 1e-7
beta1: 0.9
beta2: 0.999
amsgrad: false
}
}
pruned_model_path: "/workspace/tao-experiments/yolo_v4_tiny_resized/experiment_dir_pruned/yolov4_cspdarknet_tiny_pruned.tlt"
}
eval_config {
average_precision_mode: SAMPLE
batch_size: 8
matching_iou_threshold: 0.5
}
nms_config {
confidence_threshold: 0.001
clustering_iou_threshold: 0.5
top_k: 200
}
augmentation_config {
hue: 0.1
saturation: 1.5
exposure:1.5
vertical_flip:0
horizontal_flip: 0.5
jitter: 0.3
output_width: 640
output_height: 480
output_channel: 3
randomize_input_shape_period: 10
mosaic_prob: 0.5
mosaic_min_ratio:0.2
}
dataset_config {
data_sources: {
label_directory_path: "/workspace/tao-experiments/data/training/label"
image_directory_path: "/workspace/tao-experiments/data/training/image"
}
include_difficult_in_training: true
target_class_mapping {
key: "vehicle"
value: "vehicle"
}
target_class_mapping {
key: "person"
value: "person"
}
target_class_mapping {
key: "animal"
value: "animal"
}
validation_data_sources: {
label_directory_path: "/workspace/tao-experiments/data/val/label"
image_directory_path: "/workspace/tao-experiments/data/val/image"
}
}
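
For anyone comparing this spec with the traceback, the two shapes in the error can be checked against augmentation_config directly. The following is a minimal Python sketch, not part of TAO; the ERROR string and the SPEC values are copied by hand from this thread, and parse_shapes is a hypothetical helper written only for this illustration.

```python
import re

# Error text copied from the traceback in this thread.
ERROR = ("ValueError: Error when checking input: expected Input to have shape "
         "(3, 480, 640) but got array with shape (3, 736, 896)")

# Values copied by hand from augmentation_config in the retrain spec.
SPEC = {"output_width": 640, "output_height": 480, "output_channel": 3,
        "randomize_input_shape_period": 10}


def parse_shapes(message):
    """Return (expected, got) as (channels, height, width) tuples."""
    found = re.findall(r"\((\d+), (\d+), (\d+)\)", message)
    if len(found) != 2:
        raise ValueError("expected exactly two shapes in the message")
    expected, got = (tuple(int(v) for v in s) for s in found)
    return expected, got


expected, got = parse_shapes(ERROR)
spec_shape = (SPEC["output_channel"], SPEC["output_height"], SPEC["output_width"])

# Does the model's fixed input equal the size configured in the spec?
matches_spec = expected == spec_shape
# How much larger is the batch that actually arrived?
height_ratio = got[1] / expected[1]
width_ratio = got[2] / expected[2]

print("model input matches spec:", matches_spec)
print("batch height/width scale: %.2f / %.2f" % (height_ratio, width_ratio))
print("batch dims multiples of 32:", got[1] % 32 == 0, got[2] % 32 == 0)
```

The model input equals the configured 640 x 480, while the batch that reached it is larger in both dimensions and both of its sides are multiples of 32. One possible reading, which is an assumption and is not confirmed anywhere in this thread, is that the non-zero randomize_input_shape_period lets the data loader vary the batch shape while the pruned model was built with a fixed input size.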

Some more information: if I run inference on my test dataset using the unpruned model via TAO, it works, and my test images are annotated fairly accurately with the inferred bounding boxes. I use this command:
tao yolo_v4_tiny inference -i $DATA_DOWNLOAD_DIR/testing/image \
-o $USER_EXPERIMENT_DIR/yolo_infer_images \
-e $SPECS_DIR/yolo_v4_tiny_train_kitti_seq.txt \
-m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_cspdarknet_tiny_epoch_$EPOCH.tlt \
-l $USER_EXPERIMENT_DIR/yolo_infer_labels \
-k $KEY

If I then export the unpruned model (as I did previously with the yolov4 arch) and deploy it on a Jetson Nano, I get the following error:
ERROR: [TRT]: [network.cpp::getInput::1589] Error Code 3: Internal Error (Parameter check failed at: optimizer/api/network.cpp::getInput::1589, condition: index < getNbInputs()

Thanks for the info. I will check the original issue further.

Also, may I know whether you have ever reproduced this issue with the sample Jupyter notebook as is (it trains against the public KITTI dataset)?

Morning. No, I haven't used the sample Jupyter notebook for yolov4-tiny as is yet. I modified it to use the same custom dataset I used for training my Yolov4 model, which I managed to prune, retrain, and export to the Jetson Nano. Thanks a lot.

I cannot reproduce it with the sample Jupyter notebook.
Could you please try training with it? It trains on the public KITTI dataset.
Alternatively, you can skip the notebook and replace your own dataset with the KITTI dataset directly.

Morning Morgan

If I use the sample Jupyter notebook as is and train with the public KITTI dataset, I experience the same problems. I then decided to use the tfrecords conversion steps and the related spec files, which works perfectly. It looks like these problems only occur when I use the "_seq.txt" spec files. We can close this now, as I will be using the tfrecords conversion approach for training.

Thanks for all your assistance
