Questions regarding NVIDIA GazeNet's input


I’m working on a gaze detection project and would like to make use of the GazeNet model (Gaze Estimation | NVIDIA NGC).
As input it takes 3 grayscale crops (face, left eye and right eye) and a facegrid.
The documentation says the facegrid should be a binary mask, but I have no idea what is meant by "facegrid". I’ve searched for the term online and nothing comes up.

Could anyone help?


I believe the facegrid is a bounding area indicating where the face is in a provided image.
But full disclosure: I have not used GazeNet personally.

I found that it’s often referred to as "face grid", not just "facegrid", in white papers.

Moved post to NVIDIA GazeNet Facegrid Input @bouchemazakary and @nadeemm.

Hi and thanks for replying :)

I don’t think they’re referring to a bounding box when they say face grid.
The input specification says that the face grid is a 25x25 binary mask, and I don’t see how that could represent a bounding box when all you really need is 4 coordinates.
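
For what it’s worth, the two descriptions are not necessarily in conflict: a bounding box can be rasterized onto a coarse grid, giving a binary mask whose 1-cells mark where the face sits in the full frame and how large it is. Below is a minimal sketch of that idea in plain Python. It is not confirmed anywhere in this thread that GazeNet builds its face grid this way; the function name `make_face_grid`, the `(x, y, w, h)` box format and the rounding rule are all assumptions made for illustration.

```python
import math


def make_face_grid(frame_w, frame_h, box, grid_size=25):
    """Rasterize a face bounding box into a grid_size x grid_size binary mask.

    Hypothetical helper: `box` is (x, y, w, h) in pixels of the full frame.
    A cell is set to 1 if the box overlaps it at all (an assumed rounding rule).
    """
    x, y, w, h = box

    # Map pixel coordinates to grid cells; floor the start, ceil the end
    # so that partially covered cells are included.
    col0 = math.floor(x * grid_size / frame_w)
    row0 = math.floor(y * grid_size / frame_h)
    col1 = math.ceil((x + w) * grid_size / frame_w)
    row1 = math.ceil((y + h) * grid_size / frame_h)

    # Clamp to the grid in case the box extends past the frame.
    col0, row0 = max(col0, 0), max(row0, 0)
    col1, row1 = min(col1, grid_size), min(row1, grid_size)

    grid = [[0] * grid_size for _ in range(grid_size)]
    for r in range(row0, row1):
        for c in range(col0, col1):
            grid[r][c] = 1
    return grid


# Example: a face in the middle of a 640x480 frame.
grid = make_face_grid(640, 480, (160, 120, 320, 240))
for row in grid:
    print("".join(str(v) for v in row))
```

Compared with 4 raw coordinates, a mask like this is already normalized to the frame size and can be fed to a network as a fixed-size input. The exact layout GazeNet expects (row order, flattening, rounding) should be checked against the model’s own documentation.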
