First, I trained the model with two classes, sitting and standing postures. The average precision for both classes is very good, above 90%.
For the second model, I trained with three classes: sitting posture, standing posture, and staff. The average precision is much worse:
sit 46.509%
staff 66.269%
stand 66.87%
The training dataset has 1000 images for each class.
You can view the training images at the following links.
Sit is labelled over the whole sitting posture and stand over the whole standing posture. Staff is labelled only on the upper torso with the blue shirt (so that it is not confused with the standing posture).
May I know why you set “staff” as the third class? I am afraid some of the “staff” images may also look similar to the “sit” class.
Below is an example of the “staff” class.
But shirt color is always blue for staff.
The lower part of a sitting person may be blocked if they are behind a table. We can’t avoid that.
For staff, I relied on shirt color.
I need to detect staff so that I can ignore them. They are everywhere in the room, and I need to ignore them using shirt color. They wear a uniform with a blue top and black trousers. But I was worried about confusion with the standing class, so I only labelled the upper torso.
I need to detect standing people so that I can do something to help them.
But staff are always standing, so the first model detects them as standing.
That is why staff need to be ignored and are included as a third class.
For your case, you can use two models.
The first model is the object detection model you have already trained. It will detect standing or sitting people.
The second model can be a classification model. You can prepare a dataset and then train it with the TAO classification network. During dataset preparation, find and crop the staff (with blue shirts), then copy the crops into one folder named “staff”. Other persons (without blue shirts) go into another folder named “not-staff”.
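The dataset preparation step could be sketched roughly as below. This is only an illustration and it makes several assumptions: that the existing labels are in KITTI text format (class name first, box corners in fields 5 to 8), that the class names are "sit", "stand" and "staff", and that the output root is called cls_dataset. The helper names (parse_kitti_line, plan_crops, save_crops) and the class-to-folder mapping are hypothetical, so adjust them to your own data. The cropping itself uses Pillow.

```python
from pathlib import Path

# Hypothetical mapping from detection label names to classifier folders.
# Assumption: the labels use the names "sit", "stand" and "staff".
FOLDER_FOR_CLASS = {"staff": "staff", "sit": "not-staff", "stand": "not-staff"}


def parse_kitti_line(line):
    """Return (class_name, (left, top, right, bottom)) from one KITTI label line."""
    fields = line.split()
    name = fields[0].lower()
    # KITTI stores the 2D box as left, top, right, bottom in fields 5 to 8.
    left, top, right, bottom = (int(round(float(v))) for v in fields[4:8])
    return name, (left, top, right, bottom)


def plan_crops(label_text, image_stem, out_root="cls_dataset"):
    """List (crop_box, destination_path) pairs for one image's label file."""
    plan = []
    for index, line in enumerate(label_text.splitlines()):
        if not line.strip():
            continue
        name, box = parse_kitti_line(line)
        folder = FOLDER_FOR_CLASS.get(name)
        if folder is None:
            continue  # skip classes that are not part of the mapping
        left, top, right, bottom = box
        if right <= left or bottom <= top:
            continue  # skip degenerate boxes
        dest = Path(out_root) / folder / f"{image_stem}_{index}.jpg"
        plan.append((box, dest))
    return plan


def save_crops(image_path, plan):
    """Crop each planned box from the image and save it (requires Pillow)."""
    from PIL import Image  # imported here so the planning step needs no extra packages

    with Image.open(image_path) as img:
        for box, dest in plan:
            dest.parent.mkdir(parents=True, exist_ok=True)
            img.crop(box).convert("RGB").save(dest)
```

For each image you would read its label file, call plan_crops on the text, and pass the result to save_crops together with the image path. At inference time the same idea would probably apply: crop each box from the first model, run the classifier on the crop, and ignore the boxes it classifies as staff.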