Hi, this post is mostly to clarify some questions I have about retraining the PoseClassification model to classify new poses. Right now I am working with three new poses: squatting, fighting, and lying down. My questions are about the input data: “number of sequences”, “maximum sequence length in frames”, and “number of persons”.
Regarding “number of sequences”: this variable represents the number of elements in my training data. I’m not sure whether these sequences refer to sequences from a video, for example 50 consecutive frames in which the same person keeps changing position, so the skeleton changes continuously, or whether they can also be unrelated images, in which case the skeletons in the training data have no connection to each other.
Regarding “maximum sequence length in frames”: this is similar to the previous question. The documentation says “which is 300 (10 seconds for 30 FPS) in the NGC model”, and it also provides the training data as a NumPy array with a shape of (9441, 3, 300, 34, 1). So, do the 300 elements correspond to the same person’s skeleton?
Regarding “number of persons”: if I use two videos of 50 frames each and they contain different numbers of persons, how do I handle this? Should there be two files, i.e. (50, 3, 300, 34, 2) and (50, 3, 300, 34, 1)?
I am aware that some of my questions may seem trivial, but I need to be sure of the answers in order to proceed with the development. Any help is appreciated, and thank you in advance!
The input layout is NCTVM, where N is the batch size, C is the number of input channels, T is the sequence length, V is the number of keypoints, and M is the number of people.
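To make the NCTVM layout concrete, here is a minimal NumPy sketch. The sizes below are illustrative placeholders (the NGC training array uses N=9441, C=3, T=300, V=34, M=1):

```python
import numpy as np

# Illustrative sizes; the NGC model's array is (9441, 3, 300, 34, 1).
N, C, T, V, M = 4, 3, 300, 34, 1

# NCTVM layout:
#   N sequences, C channels per keypoint (e.g. x, y, confidence),
#   T frames per sequence, V keypoints, M persons.
data = np.zeros((N, C, T, V, M), dtype=np.float32)
print(data.shape)  # (4, 3, 300, 34, 1)
```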
The sequences stand for consecutive frames, not unrelated images.
See Pose Classification | NVIDIA NGC. If the shape is (9441, 3, 300, 34, 1), it means the maximum sequence length is 300 and the number of persons is 1. Yes, the 300 elements correspond to the same person’s skeleton.
Yes, one video can be set to (50, 3, 300, 34, 2) and the other to (50, 3, 300, 34, 1).
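Since a 50-frame clip is shorter than the maximum sequence length T=300, the frame axis has to be filled out to T. A minimal sketch, assuming a zero-padding convention (the helper name `pad_sequence` is hypothetical, not from the PoseClassification codebase):

```python
import numpy as np

T_MAX = 300  # maximum sequence length in frames

def pad_sequence(seq: np.ndarray, t_max: int = T_MAX) -> np.ndarray:
    """Zero-pad a (C, T, V, M) sequence along the frame axis to t_max."""
    c, t, v, m = seq.shape
    padded = np.zeros((c, t_max, v, m), dtype=seq.dtype)
    padded[:, :t] = seq  # copy the real frames; the rest stays zero
    return padded

# A 50-frame clip: 3 channels, 34 keypoints, 2 persons.
clip = np.random.rand(3, 50, 34, 2).astype(np.float32)
print(pad_sequence(clip).shape)  # (3, 300, 34, 2)
```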
Thank you for your prompt response. I have one more question: if I want to use the same video with varying numbers of skeletons per frame, such as a video of a group of people doing a workout, do I need to generate different sets of training data with different numbers of skeletons per frame, and then train the model on each distinct dataset? For example, by first processing the video three times, resulting in (50,3,300,34,1), (50,3,300,34,2), (50,3,300,34,3), and then training the model on each one? Or is there a different way to approach this?
It is not needed. The model can support training with multiple people.
Also, let me clarify my previous comment.
Each sequence has a maximum length (T). For a sequence that is longer than T, it needs to be broken into multiple short sequences to feed into the model. The model will return the predicted action for each short sequence.
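The splitting step above can be sketched as follows; this is an illustrative helper (`split_sequence` is a hypothetical name, not a PoseClassification API), assuming a simple non-overlapping chunking of the frame axis:

```python
import numpy as np

def split_sequence(seq: np.ndarray, t_max: int = 300) -> list:
    """Break a (C, T, V, M) sequence into chunks of at most t_max frames."""
    t = seq.shape[1]
    return [seq[:, start:start + t_max] for start in range(0, t, t_max)]

# A 750-frame clip is split into chunks of 300, 300, and 150 frames;
# the model would then predict an action for each chunk separately.
long_clip = np.zeros((3, 750, 34, 1), dtype=np.float32)
chunks = split_sequence(long_clip)
print([c.shape[1] for c in chunks])  # [300, 300, 150]
```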
N is the maximum number of sequences that the GPU can process in parallel at a time.