DeepStream Pose Estimation is not able to assign joints to individual persons

Description

I was trying the DeepStream Pose Estimation sample, which uses a tfpose model (GitHub - NVIDIA-AI-IOT/deepstream_pose_estimation: This is a sample DeepStream application to demonstrate a human pose estimation pipeline).

I am getting very good results with one or two people in a frame, but as soon as a 3rd person enters the frame, it messes up the 2nd person's skeleton. The issue seems to be in assigning the joints to individuals: it is not able to determine which joint belongs to which person. In the images attached below, you can see that it works well for 2 people, but as soon as the 3rd person comes into the picture, the joints of the 2nd person scatter across the other people in the frame.

Q1. Is there a way to get the joints assigned properly to each individual when there are more than 2 people?

Environment

TensorRT Version : 7.0.0.11
GPU Type : GTX 1650
Nvidia Driver Version : 460.73.01
CUDA Version : 10.2
CUDNN Version : -
Operating System + Version : Ubuntu 18.04
Python Version (if applicable) : 3.6.9
TensorFlow Version (if applicable) : -
PyTorch Version (if applicable) : -
Baremetal or Container (if container which image + tag) : Deepstream 5.0/5.1 container

Relevant Files

Reference Results:

Steps To Reproduce

Follow the steps from here to reproduce the result.

Will check and get back to you.

Hi @aniket.manoj. You can try using PeopleNet or any other object detector with a “person” class as the PGIE and use the pose estimation model as the SGIE, i.e., perform inference on the crops from the person detection.
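
With a crop-based setup like this, the keypoints come back relative to each person crop and have to be mapped to frame coordinates before drawing. A minimal sketch of that mapping, assuming the pose model returns keypoints normalized to [0, 1] within the crop and the detector reports boxes as (left, top, width, height) in pixels. `crop_to_frame` is a hypothetical helper, not part of the DeepStream API.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (left, top, width, height) in pixels
Point = Tuple[float, float]               # (x, y)


def crop_to_frame(keypoints: List[Point], box: Box) -> List[Point]:
    """Map keypoints normalized to a person crop back to frame pixels.

    Assumption: each keypoint is (x, y) in [0, 1] relative to the crop.
    """
    left, top, width, height = box
    return [(left + x * width, top + y * height) for x, y in keypoints]


# One detected person and two of its keypoints (illustrative values).
person_box = (100.0, 50.0, 200.0, 400.0)
crop_keypoints = [(0.5, 0.25), (0.0, 1.0)]
frame_keypoints = crop_to_frame(crop_keypoints, person_box)
print(frame_keypoints)
```

Because each crop is processed separately, every keypoint already carries the identity of the person box it came from, so no cross-person grouping step is needed.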

Hi @kn1ght. I can try any person detector to crop each person and pass each crop as input to the pose estimation model. But then it will again be single-person pose estimation, correct?

That’s correct. But it will circumvent your issue of not being able to get results for all people in a frame. Your problem will be more significant with multiple people in the frame.

That’s correct. But adding a second model will take a toll on the FPS. Also, in a scenario where 2 or more people overlap, the person detector might produce a crop containing faces or body parts of multiple people. In that case there is a high chance that the model will mix up the joints of multiple people.

When people overlap, you will have a problem irrespective of using a person detector or not. :)
It’s pretty much a trade-off between how accurate you want to be and how fast you want to run. In my experiments, I have seen significantly better accuracy when the pose model is combined with a detector.
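
One way to see how often the overlap case would actually occur in a given video is to measure the intersection over union (IoU) between person boxes and flag the pairs that overlap heavily. A minimal sketch, assuming boxes are (left, top, width, height) in pixels; the helper names and the 0.3 threshold are illustrative assumptions, not values from the sample.

```python
from itertools import combinations
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (left, top, width, height) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def overlapping_pairs(boxes: List[Box], threshold: float = 0.3) -> List[Tuple[int, int]]:
    """Indices of box pairs whose IoU is at or above the threshold."""
    return [(i, j) for (i, a), (j, b) in combinations(enumerate(boxes), 2)
            if iou(a, b) >= threshold]


boxes = [(0.0, 0.0, 10.0, 10.0), (5.0, 0.0, 10.0, 10.0), (100.0, 100.0, 10.0, 10.0)]
print(overlapping_pairs(boxes))
```

Crops flagged this way are the ones where joints from a neighbouring person are most likely to leak into the result, so they can be inspected or handled separately.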

Cool! I will try adding a detector and will let you know how things go. Thanks.

@kn1ght, I can see that the model works on multiple people; the only issue I see is in mapping the joints. I would like to know how the model decides which joint belongs to which person. Maybe that will help me figure out a way to organise the skeletons for multiple people.
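
For context on how this grouping is usually decided: bottom-up pose estimators in the OpenPose family detect candidate peaks for every joint type over the whole frame, score each candidate pair along a limb using part affinity fields, solve a one-to-one matching per limb, and then chain the matched limbs into skeletons. The sketch below assumes this model follows that scheme; it is not the sample's actual post-processing code. The names `match_limb` and `group_people`, the greedy matching (implementations often use the Hungarian algorithm instead), and all score values are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Match = Tuple[int, int]


def match_limb(scores: List[List[float]], threshold: float = 0.1) -> List[Match]:
    """Greedy one-to-one matching between candidates of two joint types.

    scores[i][j] is the affinity between candidate i of joint A and
    candidate j of joint B (assumed to come from the part affinity field).
    """
    ranked = sorted(
        ((s, i, j) for i, row in enumerate(scores)
         for j, s in enumerate(row) if s >= threshold),
        reverse=True,
    )
    used_a, used_b, matches = set(), set(), []
    for _, i, j in ranked:
        if i not in used_a and j not in used_b:   # each candidate is used once
            used_a.add(i)
            used_b.add(j)
            matches.append((i, j))
    return sorted(matches)


def group_people(limbs: Dict[Tuple[str, str], List[Match]]) -> List[Dict[str, int]]:
    """Chain matched limbs into per-person {joint_name: candidate_index} dicts.

    Assumption: limbs are listed from the root joint outward, so the first
    joint of each limb is already attached to a person when it is reached.
    """
    people: List[Dict[str, int]] = []
    for (joint_a, joint_b), matches in limbs.items():
        for idx_a, idx_b in matches:
            for person in people:
                if person.get(joint_a) == idx_a:
                    person[joint_b] = idx_b
                    break
            else:
                people.append({joint_a: idx_a, joint_b: idx_b})
    return people


# Three necks/shoulders detected, but only two elbows are visible.
neck_shoulder = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.4], [0.1, 0.5, 0.7]]
shoulder_elbow = [[0.6, 0.1], [0.2, 0.7], [0.0, 0.0]]
limbs = {
    ("neck", "r_shoulder"): match_limb(neck_shoulder),
    ("r_shoulder", "r_elbow"): match_limb(shoulder_elbow),
}
people = group_people(limbs)
print(people)
```

Under this scheme a joint ends up on the wrong person whenever an affinity score between two different people is higher than the correct one, which could produce the kind of scattering seen once a third person enters the frame.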

Hi @aniket.manoj ,
Is it possible to share the test video?

Hi @mchi ,

Please get the test video from here.

Sorry! I meant the source file, so that I can test with the same video on my side.

Thanks!

Sorry! Here is the source video.

Hello @mchi,

Were you able to find any solution? How did it go at your end?

Hello @mchi,

Did you find any solution?