I’m working on a gaze detection project and would like to make use of the GazeNet model (Gaze Estimation | NVIDIA NGC).
As input, it takes three grayscale crops (face, left eye, and right eye) and a facegrid.
The documentation says the facegrid should be a binary mask, but I have no idea what they mean by "facegrid". I've searched for the term online and nothing relevant comes up.
I don’t think they’re referring to a bounding box when they say face grid.
The input specification says that the face grid is a 25x25 binary mask, and I don't see how that could represent a bounding box when a bounding box needs only four coordinate values.
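
To make the question concrete: if the face grid were a rasterized bounding box after all, I would expect something like the sketch below, where the camera frame is divided into 25x25 cells and the cells covered by the face box are set to 1. This is purely my guess and not something I found in the model documentation; the function name `make_face_grid`, the `(x, y, w, h)` box format, and the rounding rule are all my own assumptions.

```python
import math


def make_face_grid(bbox, frame_w, frame_h, grid_size=25):
    """Rasterize a face bounding box onto a grid_size x grid_size binary mask.

    bbox is (x, y, w, h) in pixels, relative to the full camera frame.
    Hypothetical helper: the real GazeNet preprocessing may differ.
    """
    x, y, w, h = bbox

    # Map pixel coordinates to grid cells: floor the start, ceil the end,
    # so every cell the box touches is included.
    col0 = math.floor(x / frame_w * grid_size)
    row0 = math.floor(y / frame_h * grid_size)
    col1 = math.ceil((x + w) / frame_w * grid_size)
    row1 = math.ceil((y + h) / frame_h * grid_size)

    # Clamp to the grid in case the box extends past the frame.
    col0, col1 = max(col0, 0), min(col1, grid_size)
    row0, row1 = max(row0, 0), min(row1, grid_size)

    grid = [[0] * grid_size for _ in range(grid_size)]
    for r in range(row0, row1):
        for c in range(col0, col1):
            grid[r][c] = 1
    return grid


# Example: a 320x240 face box at (320, 120) in a 1280x720 frame.
grid = make_face_grid((320, 120, 320, 240), 1280, 720)
for row in grid:
    print("".join(str(v) for v in row))
```

If that is what is meant, the mask would encode where the face sits in the frame and roughly how large it is, just in a fixed-size image-like form instead of four numbers. Is that the right interpretation?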