This is a general question that does not depend on the hardware/software setup, so I have omitted those details. I am trying to build a pipeline that accurately classifies different actions. Action recognition relies on spatial-temporal information, and the built-in models supported by DeepStream are not as accurate as state-of-the-art models, which in turn cannot be used for real-time inference. So I am thinking of combining PeopleNet with ActionRecognitionNet, which might improve performance. However, the secondary-detector examples in the NVIDIA documentation do not handle spatial-temporal information, so I am wondering whether you could provide some insight into this before I start investing time in it.
Currently, no component in the DeepStream SDK supports this. You will need to customize and implement this functionality yourself.
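For anyone attempting the custom route, one piece you would likely need is a per-person frame buffer: the detector and tracker produce one crop per person per frame, while an action model consumes a short sequence of frames. Below is a minimal sketch of that buffering step in plain Python. It does not call any DeepStream API. The class name `TrackClipBuffer`, the `window` and `stride` parameters, and their default values are all hypothetical, and it assumes your tracker provides stable object IDs. Set `window` to whatever sequence length your action model expects.

```python
from collections import defaultdict, deque


class TrackClipBuffer:
    """Hypothetical helper: keeps the most recent `window` crops per track ID."""

    def __init__(self, window=32, stride=16):
        # window: number of consecutive crops that make up one clip (assumed value)
        # stride: minimum number of new crops between two emitted clips (assumed value)
        self.window = window
        self.stride = stride
        self._crops = defaultdict(lambda: deque(maxlen=window))
        self._since_emit = defaultdict(int)

    def push(self, track_id, crop):
        """Add one crop for a track; return a full clip when one is due, else None."""
        buf = self._crops[track_id]
        buf.append(crop)  # the deque drops the oldest crop once it is full
        self._since_emit[track_id] += 1
        if len(buf) == self.window and self._since_emit[track_id] >= self.stride:
            self._since_emit[track_id] = 0
            return list(buf)  # copy, so later pushes do not mutate the clip
        return None

    def drop(self, track_id):
        """Forget a track, e.g. once the tracker reports the person has left."""
        self._crops.pop(track_id, None)
        self._since_emit.pop(track_id, None)


if __name__ == "__main__":
    clips = TrackClipBuffer(window=4, stride=2)
    for frame_idx in range(6):
        clip = clips.push(track_id=7, crop=frame_idx)  # integers stand in for image crops
        if clip is not None:
            print("clip ready for track 7:", clip)
```

In a real pipeline, `push` would be called from wherever you read the detector/tracker metadata for each frame, and each returned clip would be preprocessed and passed to the action model. Overlapping windows (`stride` smaller than `window`) trade extra inference work for lower latency between predictions.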