TFRecord files for retraining BodyPoseNet

• Hardware (NVIDIA GeForce RTX 2060 SUPER)
• Network Type (BodyPoseNet)
• TAO Toolkit Version (3.22.05)

Hello,
I am currently trying to retrain BodyPoseNet on a dataset with 21 keypoints instead of 18.
I created my own dataset in COCO format.
I followed the notebook, and everything works fine when I use the default COCO dataset and spec files; the problem happens when I try to convert my own dataset to TFRecord format:

!tao bpnet dataset_convert \
    -m 'train' \
    -o $DATA_DIR/tfrecords \
    --generate_masks \
    --dataset_spec $DATA_POSE_SPECS_DIR/coco_hand_spec.json

2022-07-19 09:26:52,302 [INFO] root: Registry: [‘nvcr.io’]
2022-07-19 09:26:52,344 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
2022-07-19 09:26:52,352 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/mm/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2022-07-19 07:26:53.011803: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
RequestsDependencyWarning)
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

2022-07-19 07:26:55,041 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
RequestsDependencyWarning)
Using TensorFlow backend.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

2022-07-19 07:26:57,378 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/checkpoint_saver_hook.py:25: The name tf.train.CheckpointSaverHook is deprecated. Please use tf.estimator.CheckpointSaverHook instead.

loading annotations into memory…
Done (t=5.07s)
creating index…
index created!
0%| | 0/269 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/dataset_convert.py", line 119, in <module>
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/scripts/dataset_convert.py", line 111, in main
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/build_converter.py", line 51, in build_converter
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/coco_converter.py", line 83, in __init__
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/coco_dataset.py", line 69, in __init__
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/coco_dataset.py", line 108, in load_dataset
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/coco_dataset.py", line 163, in _parse_annotation
IndexError: list index out of range
Traceback (most recent call last):
File "/usr/local/bin/bpnet", line 8, in <module>
sys.exit(main())
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/entrypoint/bpnet.py", line 12, in main
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/common/entrypoint/entrypoint.py", line 300, in launch_job
AssertionError: Process run failed.
2022-07-19 09:27:03,657 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

I don’t know where the error might come from.
Here is my coco_hand_spec.json file:

{
  "dataset": "coco_hand",
  "root_directory_path": "/workspace/tao-experiments/bpnet/data",
  "train_data": {
    "images_root_dir_path": "train_hand",
    "mask_root_dir_path": "train_hand_mask",
    "annotation_root_dir_path": "annotations/act1_coco_annotations_train.json"
  },
  "test_data": {
    "images_root_dir_path": "test_hand",
    "mask_root_dir_path": "test_hand_mask",
    "annotation_root_dir_path": "annotations/act1_coco_annotations_test.json"
  },
  "duplicate_data_with_each_person_as_center": true,
  "categories": [
    {
      "supercategory": "person",
      "id": 1,
      "name": "person",
      "num_joints": 21,
      "keypoints": [
        "wrist", "thumb_cmc", "thumb_mcp", "thumb_ip", "thumb_tip",
        "index_finger_mcp", "index_finger_pip", "index_finger_dip", "index_finger_tip", "middle_finger_mcp",
        "middle_finger_pip", "middle_finger_dip", "middle_finger_tip", "ring_finger_mcp", "ring_finger_pip",
        "ring_finger_dip", "ring_finger_tip", "pinky_mcp", "pinky_pip", "pinky_dip", "pinky_tip"
      ],
      "skeleton": [
        [0,1], [1,2], [2,3], [3,4], [0,5], [5,6], [6,7], [7,8],
        [9,10], [10,11], [11,12], [13,14], [14,15], [15,16], [17,18], [18,19], [19,20], [0,17], [5,9], [9,13], [13,17]
      ],
      "skeleton_edge_names": [
        ["wrist", "thumb_cmc"], ["thumb_cmc", "thumb_mcp"], ["thumb_mcp", "thumb_ip"],
        ["thumb_ip", "thumb_tip"], ["wrist", "index_finger_mcp"], ["index_finger_mcp", "index_finger_pip"],
        ["index_finger_pip", "index_finger_dip"], ["index_finger_dip", "index_finger_tip"], ["middle_finger_mcp", "middle_finger_pip"],
        ["middle_finger_pip", "middle_finger_dip"], ["middle_finger_dip", "middle_finger_tip"], ["ring_finger_mcp", "ring_finger_pip"],
        ["ring_finger_pip", "ring_finger_dip"], ["ring_finger_dip", "ring_finger_tip"], ["pinky_mcp", "pinky_pip"],
        ["pinky_pip", "pinky_dip"], ["pinky_dip", "pinky_tip"], ["wrist", "pinky_mcp"],
        ["index_finger_mcp", "middle_finger_mcp"], ["middle_finger_mcp", "ring_finger_mcp"], ["ring_finger_mcp", "pinky_mcp"]
      ]
    }
  ],
  "visibility_flags": {
    "value": {
      "visible": 2,
      "occluded": 1,
      "not_labeled": 0
    },
    "mapping": {
      "visible": "visible",
      "occluded": "occluded",
      "not_labeled": "not_labeled"
    }
  },
  "data_filtering_params": {
    "min_acceptable_height": 32,
    "min_acceptable_width": 32,
    "min_acceptable_kpts": 5,
    "min_acceptable_interperson_dist_ratio": 0.3
  }
}
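
As a quick sanity check, this spec parses as JSON and its keys and paths can be verified offline. Below is a minimal sketch of such a check (not part of TAO; spec_path is an assumed local copy of the file above, and the paths under root_directory_path only resolve inside the container mounts):

# Hypothetical sanity check for the dataset spec above; the key names
# come from this thread, not from any official TAO tooling.
import json
import os

spec_path = "coco_hand_spec.json"  # assumed local copy of the spec
with open(spec_path) as f:
    spec = json.load(f)  # fails loudly on smart quotes or trailing commas

# dataset_convert reads these keys later, so confirm they exist up front
for key in ("dataset", "root_directory_path", "train_data", "test_data", "categories"):
    assert key in spec, "missing top-level key: " + key

root = spec["root_directory_path"]
for split in ("train_data", "test_data"):
    for sub in ("images_root_dir_path", "mask_root_dir_path", "annotation_root_dir_path"):
        path = os.path.join(root, spec[split][sub])
        print(split, sub, path, "exists" if os.path.exists(path) else "MISSING")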

Here is my bpnet_21_hand_joints.json:

{
  "pose_config_type": "bpnet_18_joints",
  "categories": [
    {
      "supercategory": "person",
      "id": 1,
      "name": "person",
      "num_joints": 21,
      "keypoints": [
        "wrist", "thumb_cmc", "thumb_mcp", "thumb_ip", "thumb_tip",
        "index_finger_mcp", "index_finger_pip", "index_finger_dip", "index_finger_tip", "middle_finger_mcp",
        "middle_finger_pip", "middle_finger_dip", "middle_finger_tip", "ring_finger_mcp", "ring_finger_pip",
        "ring_finger_dip", "ring_finger_tip", "pinky_mcp", "pinky_pip", "pinky_dip", "pinky_tip"
      ],
      "skeleton": [
        [0,1], [1,2], [2,3], [3,4], [0,5], [5,6], [6,7], [7,8],
        [9,10], [10,11], [11,12], [13,14], [14,15], [15,16], [17,18], [18,19], [19,20], [0,17], [5,9], [9,13], [13,17]
      ],
      "skeleton_edge_names": [
        ["wrist", "thumb_cmc"], ["thumb_cmc", "thumb_mcp"], ["thumb_mcp", "thumb_ip"],
        ["thumb_ip", "thumb_tip"], ["wrist", "index_finger_mcp"], ["index_finger_mcp", "index_finger_pip"],
        ["index_finger_pip", "index_finger_dip"], ["index_finger_dip", "index_finger_tip"], ["middle_finger_mcp", "middle_finger_pip"],
        ["middle_finger_pip", "middle_finger_dip"], ["middle_finger_dip", "middle_finger_tip"], ["ring_finger_mcp", "ring_finger_pip"],
        ["ring_finger_pip", "ring_finger_dip"], ["ring_finger_dip", "ring_finger_tip"], ["pinky_mcp", "pinky_pip"],
        ["pinky_pip", "pinky_dip"], ["pinky_dip", "pinky_tip"], ["wrist", "pinky_mcp"],
        ["index_finger_mcp", "middle_finger_mcp"], ["middle_finger_mcp", "ring_finger_mcp"], ["ring_finger_mcp", "pinky_mcp"]
      ]
    }
  ]
}
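
Since the traceback ends in an IndexError, one thing worth ruling out is an out-of-range skeleton index. A minimal self-consistency check, assuming the 0-based edge indexing used in the file above (cfg_path is a hypothetical local copy):

# Hypothetical consistency check for the pose config above; assumes the
# 0-based skeleton indexing this file uses.
import json

cfg_path = "bpnet_21_hand_joints.json"  # assumed local copy
with open(cfg_path) as f:
    cat = json.load(f)["categories"][0]

kpts, skel, names = cat["keypoints"], cat["skeleton"], cat["skeleton_edge_names"]
assert len(kpts) == cat["num_joints"], "keypoints list does not match num_joints"
assert len(skel) == len(names), "skeleton and skeleton_edge_names differ in length"

for (i, j), (a, b) in zip(skel, names):
    # every index must be in range and agree with the named edge
    assert 0 <= i < len(kpts) and 0 <= j < len(kpts), f"edge ({i},{j}) out of range"
    assert kpts[i] == a and kpts[j] == b, f"edge ({i},{j}) != ({a},{b})"
print("pose config is internally consistent")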

Tell me if you need any more information. Thanks in advance.

Here is my annotation train file:
act1_coco_annotations_train.json (1.5 MB)

Can you update your json file? You can leverage the coco_spec.json in the notebook.

Hello MorganH,

I don’t really understand what you mean by “updating my json file”.
I already ran your command with the default coco_spec.json and it works perfectly. However, when I try it with my own json file (coco_hand_spec.json, which you can see in my previous message), it doesn’t work, and the logs don’t indicate where the error comes from.
Could you be a bit more specific about what I should do?

Thanks for your help and time.

Actually, I tried to reproduce the issue with your json file, but I hit:

File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/driveix/build_wheel.runfiles/ai_infra/driveix/bpnet/dataio/build_converter.py", line 42, in build_converter
KeyError: 'root_directory_path'

So I find that your json file does not contain this path, while the default json does.
Could you please double-check your json file?

I checked my coco_hand_spec.json and it does have "root_directory_path": "/workspace/tao-experiments/bpnet/data" at line 3. I find it odd, though, since a KeyError generally means that the key doesn’t exist.
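
For what it’s worth, a one-liner like the following (run against a local copy of the spec) confirms the key parses:

python3 -c "import json; print(json.load(open('coco_hand_spec.json'))['root_directory_path'])"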

My bad. I used the wrong file. Will check further.

Hi,
I can reproduce the error with your coco_hand_spec.json:
bpnet dataset_convert -m 'train' -o ./tfrecords -d coco_hand_spec.json --generate_masks

The issue is root-caused: your act1_coco_annotations_train.json has the wrong format for bbox.

See the Data Annotation Format page in the TAO Toolkit 3.22.05 documentation.

The bbox should look as below:

"bbox": [115.16,152.13,83.23,228.41],

Indeed, this was the problem.
Thanks for your help, Morgan, and have a good weekend!
