Implementation of a skeleton-based action recognition model in DeepStream

• Hardware Platform (Jetson / GPU) GPU 3060 Ti
• DeepStream Version 6.2
• TensorRT Version
I extended the sample DeepStream pose estimation application (NVIDIA-AI-IOT/deepstream_pose_estimation on GitHub) to handle multiple input streams and added the nvtracker plugin after the nvinfer element (pgie). To do this, I create an object for each pose, attach the bounding box derived from the estimated pose to that object, and attach the objects to the corresponding frame. I then attach the pose information to each object as user metadata so that it is available after the tracking element. With this in place, I can get both the tracker ID and the pose information after nvtracker.

Now I want to implement an action classifier in DeepStream as an sgie that takes a sequence of these poses with the same tracker ID as input and outputs the action class. The input has size 48x34: 34 is the (x, y) coordinates of the 17 skeleton joints concatenated (17x2), and 48 is the sequence length, i.e. 48 poses of a given track ID.
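To make the input shape concrete, here is a minimal sketch of the per-track buffering described above, written in plain Python rather than against any DeepStream API. The class name `PoseSequenceBuffer`, its methods, and the sliding-window behaviour (emit a window on every frame once 48 poses are available) are assumptions for illustration only; the joint order and any normalisation depend on the model.

```python
from collections import defaultdict, deque

SEQ_LEN = 48               # poses per sequence (from the post)
NUM_JOINTS = 17            # skeleton joints (from the post)
POSE_DIM = NUM_JOINTS * 2  # 34 values per pose: x0, y0, x1, y1, ...


class PoseSequenceBuffer:
    """Hypothetical helper: keeps the latest SEQ_LEN poses per tracker id."""

    def __init__(self, seq_len=SEQ_LEN):
        self.seq_len = seq_len
        # One bounded queue per tracker id; old poses fall off automatically.
        self._tracks = defaultdict(lambda: deque(maxlen=seq_len))

    def push(self, track_id, joints):
        """Add one pose given as 17 (x, y) pairs.

        Returns a seq_len x 34 list of lists once the track has enough
        poses, otherwise None.
        """
        if len(joints) != NUM_JOINTS:
            raise ValueError(f"expected {NUM_JOINTS} joints, got {len(joints)}")
        # Concatenate the (x, y) pairs into a single row of 34 floats.
        flat = [float(v) for xy in joints for v in xy]
        window = self._tracks[track_id]
        window.append(flat)
        if len(window) < self.seq_len:
            return None
        # Copy the rows so callers cannot modify the buffered poses.
        return [row[:] for row in window]

    def drop(self, track_id):
        """Forget a track, e.g. when the tracker stops reporting its id."""
        self._tracks.pop(track_id, None)
```

Each tracker ID gets its own queue, so poses from different people in the same frame never mix, and a track that disappears can be released with `drop`.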

How can I do this? Is it achievable? It would be great if there is a sample related or similar to this task. I tried to understand the preprocess sample and the deepstream-3d-action-recognition sample for this, but I still couldn't work out how to apply them. I need to implement this for my master's thesis, and the deadline is near. I would really appreciate it if you could give at least some ideas.

Thank you.

The deepstream-3d-action-recognition sample is the right one to follow. The suggestion is to use nvdspreprocess to generate the 48x34 tensor data and to enable “input-tensor-from-meta” in nvinfer so that nvinfer takes its input from the tensor prepared by nvdspreprocess. You need to implement the nvdspreprocess custom library yourself. See Gst-nvdspreprocess (Alpha) in the DeepStream 6.2 Release documentation.
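As a rough illustration of what "generating the tensor data" amounts to, the sketch below packs several 48x34 windows into one contiguous float32 buffer. It is plain Python, not the nvdspreprocess interface: a real custom library would be written in C++ and would write into the buffer that the plugin provides. The function name `pack_batch` and the batch x 48 x 34 row-major layout are assumptions; the actual layout has to match what the classifier expects.

```python
from array import array

SEQ_LEN = 48   # poses per sequence (from the post)
POSE_DIM = 34  # 17 joints x (x, y) (from the post)


def pack_batch(windows):
    """Flatten a list of 48x34 windows into one row-major float32 buffer.

    Returns (shape, buffer), where shape is (batch, 48, 34) and element
    (b, t, k) is stored at index b * 48 * 34 + t * 34 + k.
    """
    buf = array("f")  # "f" is a C float, 4 bytes on common platforms
    for window in windows:
        if len(window) != SEQ_LEN or any(len(row) != POSE_DIM for row in window):
            raise ValueError("each window must be 48x34")
        for row in window:
            buf.extend(row)
    return (len(windows), SEQ_LEN, POSE_DIM), buf
```

Keeping one track's sequence contiguous means each batch entry corresponds to exactly one tracker ID, which makes it straightforward to map a classifier output back to the tracked object.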

Does this mean I have to change the Gst-nvdspreprocess plugin source code, or can it be done through the plugin's custom library interfaces (custom_tensor_function, custom_transform), as in deepstream-3d-action-recognition?

Using the custom library interfaces may work for your case.

