Hi, this is my first post in this forum, and I would like to ask for your opinion on one of my ideas. (My English is not the best, so please forgive any mistakes :) )
The project, which I am still planning, is about recognizing gestures so that I can control my room and maybe later my whole house.
I am trying to break it into several smaller tasks that I can solve in the foreseeable future.
I would start with a small person detector that finds a person in the camera image and sends the image to the next program. That program should then simply crop out the person, convert the crop to a standard format, and pass it on.
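To make the cropping step more concrete, here is a minimal sketch. It assumes the detector returns a box as (x1, y1, x2, y2) in pixels and that the "standard format" is a fixed 224x224 image; both are only assumptions for illustration, as is the function name `crop_and_resize`. It uses a simple nearest-neighbour resize in NumPy so that it runs without any other library.

```python
import numpy as np

def crop_and_resize(frame, box, size=(224, 224)):
    """Crop box=(x1, y1, x2, y2) out of a frame and resize it to size=(height, width)."""
    h, w = frame.shape[:2]
    x1, y1, x2, y2 = box
    # Clamp the box to the frame so a detection that runs over the edge does not fail.
    x1, x2 = max(0, x1), min(w, x2)
    y1, y2 = max(0, y1), min(h, y2)
    if x2 <= x1 or y2 <= y1:
        raise ValueError("box lies outside the frame")
    crop = frame[y1:y2, x1:x2]
    # Nearest-neighbour resize: pick one source row/column for every output row/column.
    out_h, out_w = size
    rows = np.arange(out_h) * crop.shape[0] // out_h
    cols = np.arange(out_w) * crop.shape[1] // out_w
    return crop[rows[:, None], cols]

# Dummy 640x480 colour frame and a made-up person box.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
person = crop_and_resize(frame, (100, 50, 300, 450))
print(person.shape)  # (224, 224, 3)
```

Resizing to a fixed size ignores the aspect ratio of the box, so a tall person crop gets squashed. Whether that matters depends on the network that comes next.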
Next comes hand detection. If a hand is found, we go on to the next step; otherwise we wait for the next image/frame. The hand should again be cropped out (as a rectangle), so that the next program can recognize the gesture more easily.
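The flow described so far could be sketched roughly like this. The three models are passed in as plain functions (`detect_person`, `detect_hand`, `classify_gesture` are hypothetical names, not from any library), and the frame is just a nested list here so that the example runs without a camera or a network.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels
Image = List[List[int]]          # stand-in for a real image: rows of pixel values

def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle described by box out of the image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

def run_pipeline(
    frame: Image,
    detect_person: Callable[[Image], Optional[Box]],
    detect_hand: Callable[[Image], Optional[Box]],
    classify_gesture: Callable[[Image], str],
) -> Optional[str]:
    """Return a gesture label, or None if this frame should be skipped."""
    person_box = detect_person(frame)
    if person_box is None:
        return None  # no person: wait for the next frame
    person = crop(frame, person_box)

    hand_box = detect_hand(person)  # box is relative to the person crop
    if hand_box is None:
        return None  # no hand: wait for the next frame
    hand = crop(person, hand_box)

    return classify_gesture(hand)

# Dummy 8x8 frame and stub models that always return fixed results.
frame = [[0] * 8 for _ in range(8)]
gesture = run_pipeline(
    frame,
    detect_person=lambda img: (2, 2, 6, 6),
    detect_hand=lambda img: (0, 0, 2, 2),
    classify_gesture=lambda img: "open_palm",
)
print(gesture)  # open_palm
```

Each stage only runs when the stage before it found something, so the later (and possibly heavier) networks are skipped on empty frames.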
Based on the recognized gesture, any action can then be triggered…
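One simple way to connect gestures to actions is a lookup table. This is only a sketch: the gesture names ("open_palm", "fist") and the light actions are made up for illustration, and a dict stands in for the real room.

```python
from typing import Callable, Dict

def make_dispatcher(actions: Dict[str, Callable[[], None]]) -> Callable[[str], bool]:
    """Return a function that runs the action registered for a gesture label."""
    def dispatch(gesture: str) -> bool:
        action = actions.get(gesture)
        if action is None:
            return False  # unknown gesture: do nothing
        action()
        return True
    return dispatch

# Stand-in for the real room: a dict instead of actual smart-home calls.
room = {"light": False}

def light_on() -> None:
    room["light"] = True

def light_off() -> None:
    room["light"] = False

dispatch = make_dispatcher({"open_palm": light_on, "fist": light_off})
dispatch("open_palm")
print(room["light"])  # True
```

New gestures can then be added by extending the table, without touching the recognition code.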
The images are always cropped so that the networks have less to process and I can save some performance. I also hope this increases the recognition rate.
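As a rough feel for the possible saving, the pixel counts can be compared. The numbers are assumptions (a 1280x720 camera frame and a 224x224 crop), and this only counts pixels; it says nothing about the real inference time, which also includes the detectors that run before the crop.

```python
frame_w, frame_h = 1280, 720  # assumed camera resolution
crop_w, crop_h = 224, 224     # assumed input size of the gesture network

frame_pixels = frame_w * frame_h
crop_pixels = crop_w * crop_h
ratio = frame_pixels / crop_pixels

print(f"{frame_pixels} vs {crop_pixels} pixels, about {ratio:.1f}x fewer")
# 921600 vs 50176 pixels, about 18.4x fewer
```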
Now to my question:
Is this possible the way I imagine it, or would I need more computing power for it? (I currently own a Jetson Nano.)
Flow chart of the process: