Suggestions on improving Nvidia 2D Body Pose Estimation Model

goran3 · September 16, 2021, 8:56am

Hello to all.

Our team from Darwin Edge has been evaluating the Nvidia 2D Body Pose Estimation model as part of our product development.

It was recently released as part of TLT 3.0 (now TAO).

Our initial benchmarking results (published in a Towards Data Science article) suggest that the model we were using before, OpenPifPaf, performs better than the Nvidia Body Pose Net model.

We would really appreciate any recommendations and help on how to improve the model. We are quite keen to use it on Nvidia NX devices.

• Hardware (Nvidia NX)
• Network Type (Body Pose Net)
• TLT Version (v3.0-py3)
bpnet_train_m1_coco_training.yaml (2.8 KB)

Morganh · September 16, 2021, 5:22pm

It is not an apples-to-apples comparison. Also, the result reported in "Hands-on: Optimizing and benchmarking Body Pose Estimation models" (Debmalya Biswas, Towards Data Science) does not match the result posted in https://developer.nvidia.com/blog/training-optimizing-2d-pose-estimation-model-with-tao-toolkit-part-2 .
The NVIDIA blog states: “We use a default size 288×384 in this post.”
Its result for the pruned model is:

After retraining the pruned model with pth 0.2, you can observe an accuracy of 57.5% AP with multiscale inference. Here are the metrics on COCO validation set:

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.575
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.789
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.621
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.563
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.603
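When comparing two evaluation runs, it can help to turn these COCO-style summary lines into numbers first. The following is only a minimal sketch: the `parse_coco_ap` helper and the regular expression are illustrative assumptions written against the lines quoted above, not part of TAO or pycocotools.

```python
import re

# Metric lines copied from the quoted evaluation output above.
REPORT = """\
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.575
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.789
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.621
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.563
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.603
"""

# Hypothetical pattern; whitespace around the separators is allowed to vary.
PATTERN = re.compile(
    r"Average Precision \(AP\) @\[\s*IoU=(?P<iou>[\d.:]+)\s*\|"
    r"\s*area=\s*(?P<area>\w+)\s*\|"
    r"\s*maxDets=\s*(?P<max_dets>\d+)\s*\]\s*=\s*(?P<value>\d+\.\d+)"
)


def parse_coco_ap(report):
    """Return a dict mapping (IoU range, area) to the AP value."""
    results = {}
    for match in PATTERN.finditer(report):
        results[(match["iou"], match["area"])] = float(match["value"])
    return results


ap = parse_coco_ap(REPORT)
for (iou, area), value in ap.items():
    print(f"IoU={iou:<9} area={area:<6} AP={value:.3f}")
```

With both runs parsed this way, the per-category values (for example `ap[("0.50:0.95", "medium")]`) can be compared directly, provided the two runs used the same input resolution and backbone.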

It also says:

You can expect to see a 7-10% AP increase in the area=medium category when going from 224×320 to 288×384 and an additional 7-10% AP when you choose 320×448.

So you can expect an AP increase when going from 288×384 to 368×368.

The result in the Towards Data Science article ("Hands-on: Optimizing and benchmarking Body Pose Estimation models") is based on 368×368.
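To make the resolution difference concrete, here is a rough sketch that compares the input sizes mentioned in this thread by pixel count. Pixel count is only a crude proxy assumed here for illustration; it does not predict AP by itself, and the `RESOLUTIONS` table and helper names are hypothetical.

```python
# Input sizes mentioned in this thread, stored as (first, second) dimension
# in the order they are written in the posts.
RESOLUTIONS = {
    "224x320": (224, 320),
    "288x384": (288, 384),
    "320x448": (320, 448),
    "368x368": (368, 368),
}


def pixel_count(size):
    """Number of pixels in an input of the given size."""
    first, second = size
    return first * second


def relative_area(name, baseline="288x384"):
    """Pixel count of `name` divided by the pixel count of `baseline`."""
    return pixel_count(RESOLUTIONS[name]) / pixel_count(RESOLUTIONS[baseline])


for name, size in RESOLUTIONS.items():
    print(f"{name}: {pixel_count(size)} px, "
          f"{relative_area(name):.2f}x the 288x384 default")
```

By this measure, 368×368 has about 22% more pixels than the 288×384 default and somewhat fewer than 320×448, so the two published results were obtained at noticeably different input sizes.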

In addition, the NVIDIA blog uses VGG as the backbone, while the Towards Data Science article uses ResNet-50.

You can try training according to "Training and Optimizing a 2D Pose Estimation Model with NVIDIA TAO Toolkit, Part 1" (NVIDIA Technical Blog), or run evaluation with the BodyPoseNet model from NVIDIA NGC.

goran3 · September 17, 2021, 8:40am

Thank you for a really fast reply. We really appreciate it.

Our team will analyze this, and we will post our results here once we have done the retraining and benchmarking.
