BPNet dataset_convert error in TAO

• Hardware (3090)
• Network Type (bpnet)
• TLT Version (3.22.05)
I am trying to retrain BodyPoseNet on a pig dataset with 27 keypoints instead of 18.
The problem happens when I try to convert my own dataset to tfrecord format.
command:

!tao bpnet dataset_convert \
        -m 'train' \
        -o $DATA_DIR/train \
        --generate_masks \
        --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec_pig_27.json

error:

2022-10-19 10:14:15,316 [INFO] root: Registry: ['nvcr.io']
2022-10-19 10:14:15,358 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
2022-10-19 10:14:15,436 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/nxin/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2022-10-19 02:14:16.023172: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

2022-10-19 02:14:17,802 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

2022-10-19 02:14:19,962 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/dataset_convert.py", line 119, in <module>
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/dataset_convert.py", line 111, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/build_converter.py", line 51, in build_converter
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/coco_converter.py", line 83, in __init__
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/coco_dataset.py", line 42, in __init__
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/coco_dataset.py", line 130, in _get_category
IndexError: list index out of range
Traceback (most recent call last):
  File "/usr/local/bin/bpnet", line 8, in <module>
    sys.exit(main())
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/entrypoint/bpnet.py", line 12, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/common/entrypoint/entrypoint.py", line 300, in launch_job
AssertionError: Process run failed.
2022-10-19 10:14:20,584 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

file coco_spec_pig_27.json:

{
    "dataset": "coco_pig_point",
    "root_directory_path": "/workspace/tao-experiments/bpnet/data",
    "train_data": {
        "images_root_dir_path": "train2017",
        "mask_root_dir_path": "train_mask2017",
        "annotation_root_dir_path": "annotations/train.json"
    },
    "test_data": {
        "images_root_dir_path": "val2017",
        "mask_root_dir_path": "val_mask2017",
        "annotation_root_dir_path": "annotations/val.json"
    },
    "duplicate_data_with_each_person_as_center": true,
    "categories": [
        {
            "supercategory": "animal",
            "id": 0,
            "name": "pig",
            "num_joints": 27,
            "keypoints": [
                "pig_point_1", "pig_point_2", "pig_point_3", "pig_point_4", "pig_point_5",
                "pig_point_6", "pig_point_7", "pig_point_8", "pig_point_9", "pig_point_10",
                "pig_point_11", "pig_point_12", "pig_point_13","pig_point_14", "pig_point_15",
                "pig_point_16", "pig_point_17", "pig_point_18", "pig_point_19", "pig_point_20",
                "pig_point_21", "pig_point_22", "pig_point_23", "pig_point_24", "pig_point_25",
                "pig_point_26", "pig_point_27"
            ],
            "skeleton": [
                [0,1],[0,23],[0,19],[0,5],[0,17],[1,2],[23,24],[19,18],
                [2,20],[18,20],[20,21],[21,3],[3,4],[4,5],[21,15],[15,16],[16,17],[21,6],[21,16],
                [21,22],[24,25],[25,26],[22,7],[7,8],[8,9],[22,11],[11,12],[12,13],[26,7],[26,11],
                [22,10],[10,9],[10,13],[6,22],[14,22],[25,6],[25,14],[24,3],[24,15]

            ],
            "skeleton_edge_names": [
                ["pig_point_1", "pig_point_2"], ["pig_point_1", "pig_point_24"], ["pig_point_1", "pig_point_20"],
                ["pig_point_1", "pig_point_6"], ["pig_point_1", "pig_point_18"], ["pig_point_2", "pig_point_3"],
                ["pig_point_24", "pig_point_25"], ["pig_point_20", "pig_point_19"], ["pig_point_3", "pig_point_21"],
                ["pig_point_19", "pig_point_21"], ["pig_point_21", "pig_point_22"], ["pig_point_22", "pig_point_4"],
                ["pig_point_4", "pig_point_5"], ["pig_point_5", "pig_point_6"], ["pig_point_22", "pig_point_16"],
                ["pig_point_16", "pig_point_17"], ["pig_point_17", "pig_point_18"], ["pig_point_22","pig_point_7"],
                ["pig_point_22","pig_point_15"],["pig_point_22","pig_point_23"],["pig_point_25","pig_point_26"],
                ["pig_point_26","pig_point_27"],["pig_point_23","pig_point_8"],["pig_point_8","pig_point_9"],
                ["pig_point_9","pig_point_10"],["pig_point_23","pig_point_12"],["pig_point_12","pig_point_13"],
                ["pig_point_13","pig_point_14"],["pig_point_27","pig_point_8"],["pig_point_27","pig_point_12"],
                ["pig_point_23","pig_point_11"],["pig_point_11","pig_point_10"],["pig_point_11","pig_point_14"],
                ["pig_point_7","pig_point_23"],["pig_point_15","pig_point_23"],["pig_point_26","pig_point_7"],
                ["pig_point_26","pig_point_15"],["pig_point_25","pig_point_4"],["pig_point_25","pig_point_16"]
            ]
        }
    ],
    "visibility_flags": {
        "value": {
            "visible": 2,
            "occluded": 1,
            "not_labeled": 0
        },
        "mapping": {
            "visible": "visible",
            "occluded": "occluded",
            "not_labeled": "not_labeled"
        }
    },
    "data_filtering_params": {
        "min_acceptable_height": 32,
        "min_acceptable_width": 32,
        "min_acceptable_kpts": 5,
        "min_acceptable_interperson_dist_ratio": 0.3
    }
}
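
A spec this long is easy to get subtly wrong, so a small consistency check may help before running the conversion. The sketch below is only an illustration: `check_skeleton` is a hypothetical helper, not part of TAO, and it assumes the `skeleton` pairs are 0-based indices into `keypoints` (which is how most pairs in the spec above line up with `skeleton_edge_names`).

```python
def check_skeleton(category):
    """Return the entries where the spec disagrees with itself.

    Assumption: "skeleton" holds 0-based indices into "keypoints", and
    "skeleton_edge_names" lists the same edges in the same order.
    """
    keypoints = category["keypoints"]
    index_of = {name: i for i, name in enumerate(keypoints)}
    problems = []

    # num_joints should match the number of keypoint names.
    if category.get("num_joints") != len(keypoints):
        problems.append(("num_joints", category.get("num_joints"), len(keypoints)))

    skeleton = category["skeleton"]
    edge_names = category["skeleton_edge_names"]
    if len(skeleton) != len(edge_names):
        problems.append(("length", len(skeleton), len(edge_names)))

    # Compare each index pair with the indices implied by the name pair.
    for pos, (pair, name_pair) in enumerate(zip(skeleton, edge_names)):
        expected = [index_of.get(name) for name in name_pair]
        if list(pair) != expected:
            problems.append((pos, list(pair), list(name_pair)))
    return problems


# Toy category (made-up names) with one deliberate mismatch in the last pair.
toy = {
    "num_joints": 3,
    "keypoints": ["a", "b", "c"],
    "skeleton": [[0, 1], [1, 2], [0, 1]],
    "skeleton_edge_names": [["a", "b"], ["b", "c"], ["a", "c"]],
}
print(check_skeleton(toy))
```

For the real file, the same function could be applied to each entry of `"categories"` after loading the spec with `json.load`. Under the 0-based assumption, the 19th pair of the spec as posted looks inconsistent: `skeleton` lists `[21,16]` while `skeleton_edge_names` gives `"pig_point_22"`, `"pig_point_15"` (which would be `[21,14]`). Whether that is related to either error is not known.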

train.json (3.7 MB)

I changed “category_id” from 0 to 1 in “train.json” and “coco_spec_pig_27.json”, and then the conversion succeeded.

"supercategory": "person",
"id": 1,
"name": "person"
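
Since the fix came down to the category id, it may be worth confirming that the ids in the spec and in the annotation file agree before converting. The following is a minimal sketch, not TAO code: `compare_category_ids` is a hypothetical helper, and it assumes the annotation file follows the usual COCO layout with top-level `"categories"` and `"annotations"` lists. What `_get_category` does internally is not visible from the traceback, so this is only a sanity check.

```python
from collections import Counter


def compare_category_ids(spec, annotations):
    """Summarize category ids in the dataset spec and a COCO-style annotation dict."""
    spec_ids = {c["id"] for c in spec["categories"]}
    declared_ids = {c["id"] for c in annotations.get("categories", [])}
    # How many annotations use each category_id.
    used = Counter(a["category_id"] for a in annotations.get("annotations", []))
    return {
        "spec_ids": sorted(spec_ids),
        "declared_ids": sorted(declared_ids),
        "used_ids": dict(used),
        "missing_from_spec": sorted(set(used) - spec_ids),
        "missing_from_declared": sorted(set(used) - declared_ids),
    }


# Toy data (made up): one annotation still uses the old id 0.
toy_spec = {"categories": [{"id": 1, "name": "pig"}]}
toy_annotations = {
    "categories": [{"id": 1, "name": "pig"}],
    "annotations": [{"category_id": 1}, {"category_id": 1}, {"category_id": 0}],
}
report = compare_category_ids(toy_spec, toy_annotations)
print(report["missing_from_spec"])

# With the real files, both dicts would come from json.load, e.g.
#   spec = json.load(open("coco_spec_pig_27.json"))
#   annotations = json.load(open("annotations/train.json"))
# (paths relative to wherever the files live on your machine).
```

An empty `missing_from_spec` and `missing_from_declared` would suggest the ids are at least consistent between the two files.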

But training still fails.
command:

!tao bpnet train -e $SPECS_DIR/bpnet_train_m1_coco_pig.yaml \
                 -r $USER_EXPERIMENT_DIR/models/exp_m1_unpruned \
                 -k $KEY \
                 --gpus $NUM_GPUS

error:

2022-10-19 11:07:12,025 [INFO] root: Registry: ['nvcr.io']
2022-10-19 11:07:12,068 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
2022-10-19 11:07:12,146 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/nxin/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2022-10-19 03:07:12.729592: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

2022-10-19 03:07:14,540 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

2022-10-19 03:07:16,864 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py:91: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

WARNING 2022-10-19 03:07:16,865| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py:91: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py:91: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.

WARNING 2022-10-19 03:07:16,865| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py:91: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.

/workspace/tao-experiments/bpnet/models/exp_m1_unpruned
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

2022-10-19 03:07:16,877 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py:91: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

WARNING 2022-10-19 03:07:16,877| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py:91: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py:91: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.

WARNING 2022-10-19 03:07:16,877| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py:91: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.

/workspace/tao-experiments/bpnet/models/exp_m1_unpruned
/usr/local/lib/python3.6/dist-packages/driveix/bpnet/scripts/train.py:110: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:484: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

WARNING 2022-10-19 03:07:17,240| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:484: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

INFO    2022-10-19 03:07:17,244| __main__: done
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING 2022-10-19 03:07:17,244| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

/usr/local/lib/python3.6/dist-packages/driveix/bpnet/scripts/train.py:110: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:484: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

WARNING 2022-10-19 03:07:17,278| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:484: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

/workspace/tao-experiments/bpnet/data/train-fold-000-of-001: 3852
Total Samples: 3852
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING 2022-10-19 03:07:17,283| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:319: The name tf.matrix_inverse is deprecated. Please use tf.linalg.inv instead.

WARNING 2022-10-19 03:07:17,306| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:319: The name tf.matrix_inverse is deprecated. Please use tf.linalg.inv instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:224: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.

WARNING 2022-10-19 03:07:17,314| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:224: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.

/workspace/tao-experiments/bpnet/data/train-fold-000-of-001: 3852
Total Samples: 3852
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:319: The name tf.matrix_inverse is deprecated. Please use tf.linalg.inv instead.

WARNING 2022-10-19 03:07:17,341| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:319: The name tf.matrix_inverse is deprecated. Please use tf.linalg.inv instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:224: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.

WARNING 2022-10-19 03:07:17,348| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py:224: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.

INFO    2022-10-19 03:07:17,504| driveix.bpnet.trainers.bpnet_trainer: Building model graph from model defintion ...
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING 2022-10-19 03:07:17,505| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING 2022-10-19 03:07:17,517| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

INFO    2022-10-19 03:07:17,532| driveix.bpnet.trainers.bpnet_trainer: Building model graph from model defintion ...
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING 2022-10-19 03:07:17,533| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING 2022-10-19 03:07:17,544| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4115: The name tf.random_normal is deprecated. Please use tf.random.normal instead.

WARNING 2022-10-19 03:07:17,668| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4115: The name tf.random_normal is deprecated. Please use tf.random.normal instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4115: The name tf.random_normal is deprecated. Please use tf.random.normal instead.

WARNING 2022-10-19 03:07:17,688| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4115: The name tf.random_normal is deprecated. Please use tf.random.normal instead.

INFO    2022-10-19 03:07:17,860| driveix.bpnet.trainers.bpnet_trainer: First run ...
INFO    2022-10-19 03:07:17,860| driveix.bpnet.trainers.bpnet_trainer: Intializing model with pre-trained weights /workspace/tao-experiments/bpnet/pretrained_model/bodyposenet_vtrainable_v1.0/model.tlt...
INFO    2022-10-19 03:07:17,872| driveix.bpnet.trainers.bpnet_trainer: First run ...
INFO    2022-10-19 03:07:17,872| driveix.bpnet.trainers.bpnet_trainer: Intializing model with pre-trained weights /workspace/tao-experiments/bpnet/pretrained_model/bodyposenet_vtrainable_v1.0/model.tlt...
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING 2022-10-19 03:07:18,314| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING 2022-10-19 03:07:18,314| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING 2022-10-19 03:07:18,833| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING 2022-10-19 03:07:18,833| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING 2022-10-19 03:07:18,834| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING 2022-10-19 03:07:18,853| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING 2022-10-19 03:07:18,853| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING 2022-10-19 03:07:18,853| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

WARNING 2022-10-19 03:07:19,084| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

WARNING 2022-10-19 03:07:19,114| tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
  warnings.warn('No training configuration found in save file: '
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/losses/bpnet_loss.py:120: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

WARNING 2022-10-19 03:07:25,593| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/losses/bpnet_loss.py:120: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
  warnings.warn('No training configuration found in save file: '
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/losses/bpnet_loss.py:120: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

WARNING 2022-10-19 03:07:25,822| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/losses/bpnet_loss.py:120: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:61: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead.

WARNING 2022-10-19 03:07:27,238| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:61: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:62: The name tf.train.StopAtStepHook is deprecated. Please use tf.estimator.StopAtStepHook instead.

WARNING 2022-10-19 03:07:27,238| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:62: The name tf.train.StopAtStepHook is deprecated. Please use tf.estimator.StopAtStepHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/trainers/bpnet_trainer.py:300: The name tf.train.NanTensorHook is deprecated. Please use tf.estimator.NanTensorHook instead.

WARNING 2022-10-19 03:07:27,238| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/trainers/bpnet_trainer.py:300: The name tf.train.NanTensorHook is deprecated. Please use tf.estimator.NanTensorHook instead.

INFO    2022-10-19 03:07:27,469| __main__: training
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:61: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead.

WARNING 2022-10-19 03:07:27,471| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:61: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:62: The name tf.train.StopAtStepHook is deprecated. Please use tf.estimator.StopAtStepHook instead.

WARNING 2022-10-19 03:07:27,471| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:62: The name tf.train.StopAtStepHook is deprecated. Please use tf.estimator.StopAtStepHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:75: The name tf.train.StepCounterHook is deprecated. Please use tf.estimator.StepCounterHook instead.

WARNING 2022-10-19 03:07:27,472| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:75: The name tf.train.StepCounterHook is deprecated. Please use tf.estimator.StepCounterHook instead.

INFO:tensorflow:Create CheckpointSaverHook.
INFO    2022-10-19 03:07:27,472| tensorflow: Create CheckpointSaverHook.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:104: The name tf.train.SummarySaverHook is deprecated. Please use tf.estimator.SummarySaverHook instead.

WARNING 2022-10-19 03:07:27,472| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:104: The name tf.train.SummarySaverHook is deprecated. Please use tf.estimator.SummarySaverHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/trainers/bpnet_trainer.py:300: The name tf.train.NanTensorHook is deprecated. Please use tf.estimator.NanTensorHook instead.

WARNING 2022-10-19 03:07:27,472| tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/trainers/bpnet_trainer.py:300: The name tf.train.NanTensorHook is deprecated. Please use tf.estimator.NanTensorHook instead.

INFO:tensorflow:Graph was finalized.
INFO    2022-10-19 03:07:28,276| tensorflow: Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO    2022-10-19 03:07:29,797| tensorflow: Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO    2022-10-19 03:07:29,930| tensorflow: Done running local_init_op.
INFO:tensorflow:Graph was finalized.
INFO    2022-10-19 03:07:32,104| tensorflow: Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO    2022-10-19 03:07:33,502| tensorflow: Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO    2022-10-19 03:07:33,606| tensorflow: Done running local_init_op.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: ValueError: not enough values to unpack (expected 3, got 0)
Traceback (most recent call last):

  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/script_ops.py", line 235, in __call__
    ret = func(*args)

  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/processors/augmentation.py", line 283, in call

  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/processors/augmentation.py", line 369, in apply_affine_transform_kpts

ValueError: not enough values to unpack (expected 3, got 0)


	 [[{{node DataLoader/PyFunc_14}}]]
  (1) Invalid argument: ValueError: not enough values to unpack (expected 3, got 0)
Traceback (most recent call last):

  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/script_ops.py", line 235, in __call__
    ret = func(*args)

  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/processors/augmentation.py", line 283, in call

  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/processors/augmentation.py", line 369, in apply_affine_transform_kpts

ValueError: not enough values to unpack (expected 3, got 0)


	 [[{{node DataLoader/PyFunc_14}}]]
	 [[DataLoader/PyFunc_7/_4779]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py", line 146, in <module>
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py", line 137, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/trainers/bpnet_trainer.py", line 316, in train
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/blocks/trainers/trainer.py", line 119, in run_training_loop
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 754, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1360, in run
    raise six.reraise(*original_exc_info)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 696, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1345, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1418, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1176, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: ValueError: not enough values to unpack (expected 3, got 0)
Traceback (most recent call last):

  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/script_ops.py", line 235, in __call__
    ret = func(*args)

  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/processors/augmentation.py", line 283, in call

  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/processors/augmentation.py", line 369, in apply_affine_transform_kpts

ValueError: not enough values to unpack (expected 3, got 0)


	 [[node DataLoader/PyFunc_14 (defined at usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
  (1) Invalid argument: ValueError: not enough values to unpack (expected 3, got 0)
Traceback (most recent call last):

  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/script_ops.py", line 235, in __call__
    ret = func(*args)

  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/processors/augmentation.py", line 283, in call

  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/processors/augmentation.py", line 369, in apply_affine_transform_kpts

ValueError: not enough values to unpack (expected 3, got 0)


	 [[node DataLoader/PyFunc_14 (defined at usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
	 [[DataLoader/PyFunc_7/_4779]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'DataLoader/PyFunc_14':
  File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py", line 146, in <module>
  File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/train.py", line 134, in main
  File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/trainers/bpnet_trainer.py", line 255, in build
  File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/trainers/bpnet_trainer.py", line 151, in _build_distributed
  File "root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataloaders/bpnet_dataloader.py", line 208, in __call__
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/script_ops.py", line 521, in numpy_function
    return py_func_common(func, inp, Tout, stateful=True, name=name)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/script_ops.py", line 495, in py_func_common
    func=func, inp=inp, Tout=Tout, stateful=stateful, eager=False, name=name)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/script_ops.py", line 318, in _internal_py_func
    input=inp, token=token, Tout=Tout, name=name)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_script_ops.py", line 170, in py_func
    "PyFunc", input=input, token=token, Tout=Tout, name=name)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func
    return func(*args, **kwargs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[37697,1],1]
  Exit code:    1
--------------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/bin/bpnet", line 8, in <module>
    sys.exit(main())
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/entrypoint/bpnet.py", line 12, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/common/entrypoint/entrypoint.py", line 300, in launch_job
AssertionError: Process run failed.
2022-10-19 11:07:39,157 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
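
The innermost error in the log above is a plain Python unpacking failure raised inside apply_affine_transform_kpts. The TAO source is not shown here, so the following is only a sketch of how that exact message arises; the helper name unpack_keypoint and the (x, y, visibility) layout are assumptions for illustration. It suggests that an empty keypoint record reached the augmentation step.

```python
# Hypothetical helper: unpack one keypoint into three values.
# The (x, y, visibility) layout is an assumption, not taken from TAO.
def unpack_keypoint(kpt):
    x, y, visibility = kpt
    return x, y, visibility


try:
    # An empty record has zero values, so the three-way unpack fails.
    unpack_keypoint([])
except ValueError as err:
    message = str(err)

print(message)  # not enough values to unpack (expected 3, got 0)
```

If this is what happens in the data loader, checking the converted records for annotations with no keypoints would be a reasonable first step.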

My spec file:

__class_name__: BpNetTrainer
checkpoint_dir: /workspace/tao-experiments/bpnet/models/exp_m1_unpruned
log_every_n_secs: 30
checkpoint_n_epoch: 5
num_epoch: 50
summary_every_n_steps: 20
infrequent_summary_every_n_steps: 0
validation_every_n_epoch: 5
max_ckpt_to_keep: 100
random_seed: 42
pretrained_weights: /workspace/tao-experiments/bpnet/pretrained_model/bodyposenet_vtrainable_v1.0/model.tlt
load_graph: False
finetuning_config:
  is_finetune_exp: False
  checkpoint_path: null
  ckpt_epoch_num: 0
use_stagewise_lr_multipliers: True
dataloader:
  __class_name__: BpNetDataloader
  batch_size: 10
  pose_config:
    __class_name__: BpNetPoseConfig
    target_shape: [32, 32]
    pose_config_path: /workspace/examples/bpnet/model_pose_config/bpnet_27joints.json
  image_config:
    image_dims:
      height: 256
      width: 256
      channels: 3
    image_encoding: jpg
  dataset_config:
    root_data_path: /workspace/tao-experiments/bpnet/data/
    train_records_folder_path: /workspace/tao-experiments/bpnet/data
    train_records_path: [train-fold-000-of-001]
    val_records_folder_path: /workspace/tao-experiments/bpnet/data
    val_records_path: [val-fold-000-of-001]
    dataset_specs:
      coco: /workspace/examples/bpnet/data_pose_config/coco_spec_pig_27.json
  normalization_params: 
    image_scale: [256.0, 256.0, 256.0]
    image_offset: [0.5, 0.5, 0.5]
    mask_scale: [255.0]
    mask_offset: [0.0]
  augmentation_config:
    __class_name__: AugmentationConfig
    spatial_augmentation_mode: person_centric
    spatial_aug_params:
      flip_lr_prob: 0.5
      flip_tb_prob: 0.0
      rotate_deg_max: 40.0
      rotate_deg_min: -40.0
      zoom_prob: 0.0
      zoom_ratio_min: 1.0
      zoom_ratio_max: 1.0
      translate_max_x: 40.0
      translate_min_x: -40.0
      translate_max_y: 40.0
      translate_min_y: -40.0
      use_translate_ratio: False
      translate_ratio_max: 0.2
      translate_ratio_min: -0.2
      target_person_scale: 0.6
    identity_spatial_aug_params:
      null
  label_processor_config:
    paf_gaussian_sigma: 0.03
    heatmap_gaussian_sigma: 7.0
    paf_ortho_dist_thresh: 1.0
  shuffle_buffer_size: 20000
model:
  __class_name__: BpNetLiteModel
  backbone_attributes:
    architecture: vgg
    mtype: default
    use_bias: False
  stages: 3
  heat_channels: 28
  paf_channels: 78
  use_self_attention: False
  data_format: channels_last
  use_bias: True
  regularization_type: l1
  kernel_regularization_factor: 5.0e-4
  bias_regularization_factor: 0.0
  kernel_initializer: random_normal
optimizer:
  __class_name__: WeightedMomentumOptimizer
  learning_rate_schedule:
    __class_name__: SoftstartAnnealingLearningRateSchedule
    soft_start: 0.05
    annealing: 0.5
    base_learning_rate: 2.e-5
    min_learning_rate: 8.e-08
    last_step: null
  grad_weights_dict: null
  weight_default_value: 1.0
  momentum: 0.9
  use_nesterov: False
loss:
  __class_name__: BpNetLoss

Can I change the output channels “heat_channels” and “paf_channels”, or must they be 19 and 38?

Currently, bpnet is not compatible with a custom skeleton.
For an animal dataset, it can work if the annotations match the default keypoints.
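
For context, the default values 19 and 38 follow the usual OpenPose-style layout: one heatmap per keypoint plus one background channel, and two part-affinity-field channels (x and y components) per skeleton edge. A minimal sketch of that arithmetic follows. The function name is hypothetical, and the edge count of 39 for the pig skeleton is only inferred from paf_channels: 78 in the spec above. It does not change the fact that bpnet does not currently support a custom skeleton.

```python
from typing import Tuple


def expected_channels(num_keypoints: int, num_skeleton_edges: int) -> Tuple[int, int]:
    """Return (heat_channels, paf_channels) under the OpenPose-style convention."""
    heat_channels = num_keypoints + 1       # one heatmap per keypoint, plus background
    paf_channels = 2 * num_skeleton_edges   # x and y vector component for each edge
    return heat_channels, paf_channels


# Default 18-keypoint skeleton with 19 edges.
print(expected_channels(18, 19))  # (19, 38)

# 27 keypoints; 39 edges is an assumption inferred from paf_channels: 78.
print(expected_channels(27, 39))  # (28, 78)
```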

Thank you for your answer.
I tried mapping the pig keypoints onto the default keypoints. Training works, but the results are not very good.
A few weeks ago I saw news from NVIDIA about OneCup’s successful use of TAO Toolkit (news).
The keypoints and skeleton in their product (onecup AI) look custom. My own attempt failed, and you said a custom skeleton is not supported yet.
How was this done with TAO Toolkit?

For OneCup, you can have a look at the video “Building BETSY, World's First AI Ranch Hand” on YouTube. The product contains 25 different models.

For keypoints, fpenet-generic is an option, and it is already supported. To train a model with fpenet-generic, users can follow the fpenet Jupyter notebook, which has an environment variable NUM_KEYPOINTS. See “Facial Landmarks Estimation” in the TAO Toolkit 3.22.05 documentation for more information. Besides facial landmarks, the keypoint estimator can also be used to predict keypoints for general-purpose applications.
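
As a rough illustration, an environment variable such as NUM_KEYPOINTS can be set from a notebook cell in Python before the training commands run. This is only a sketch: the value 27 comes from the pig dataset in this thread, and how the fpenet notebook actually consumes the variable is not shown here.

```python
import os

# Set the variable for this process and any child processes it launches.
# 27 matches the pig dataset discussed above; environment values must be strings.
os.environ["NUM_KEYPOINTS"] = "27"

# Read it back as an integer wherever a keypoint count is needed.
num_keypoints = int(os.environ["NUM_KEYPOINTS"])
print(num_keypoints)  # 27
```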

Great, thanks for your help.
