Facial Landmark Estimator (FPENet) annotation guidelines

bassel.musharrafieh · May 4, 2022, 8:05am

I would like to know whether there are annotation guidelines beyond those given on the FPENet page in the NVIDIA NGC catalog: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/fpenet

The overview section of the FPENet page shows the landmarks numbered on a picture. Some points are clear, but others are ambiguous, especially the additional eye landmarks (81-104) and the pupil landmarks (69-76). For example, how exactly should points 60-68 be annotated? Is there a specific place for each of them, and if the mouth is closed, should these points overlap?
In the eyes, many points overlap and their placement is unclear. For instance, it is not clear how the pupil points and the additional eye landmarks should be distributed inside the eye. Is there a more detailed 104-keypoint annotation guideline that annotators can follow when labeling data?

Morganh · May 4, 2022, 2:13pm

There is no more detailed information about the keypoints. Please continue to refer to the model card.

This model predicts 68, 80 or 104 keypoints for a given face. Chin: 1-17, Eyebrows: 18-27, Nose: 28-36, Eyes: 37-48, Mouth: 49-61, Inner Lips: 62-68, Pupil: 69-76, Ears: 77-80, additional eye landmarks: 81-104.
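
As a reading aid, the index ranges quoted above can be written down as a small lookup table. This is only a sketch: the dictionary and the helper names `group_of` and `groups_in` are hypothetical and not part of TAO Toolkit. The ranges are copied from the model card text and are 1-based and inclusive.

```python
# Keypoint groups as quoted from the model card (1-based, inclusive ranges).
# The names below are illustrative, not a TAO Toolkit API.
KEYPOINT_GROUPS = {
    "chin": (1, 17),
    "eyebrows": (18, 27),
    "nose": (28, 36),
    "eyes": (37, 48),
    "mouth": (49, 61),
    "inner_lips": (62, 68),
    "pupil": (69, 76),
    "ears": (77, 80),
    "additional_eye": (81, 104),
}


def group_of(index):
    """Return the group name for a 1-based keypoint index."""
    for name, (first, last) in KEYPOINT_GROUPS.items():
        if first <= index <= last:
            return name
    raise ValueError(f"keypoint index {index} is outside 1-104")


def groups_in(num_keypoints):
    """Return the groups fully covered by a 68-, 80- or 104-keypoint layout."""
    if num_keypoints not in (68, 80, 104):
        raise ValueError("the model card lists 68, 80 or 104 keypoints")
    return [name for name, (_, last) in KEYPOINT_GROUPS.items()
            if last <= num_keypoints]
```

For example, `group_of(70)` falls in the pupil group, and `groups_in(68)` ends with the inner lips, which matches the 68-keypoint layout stopping before the pupil points.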

TRAINING DATA AND GROUND-TRUTH LABELING GUIDELINES

A pre-trained (trainable) model is available, trained on a combination of an NVIDIA internal dataset and the Multi-PIE dataset. The NVIDIA internal data has approximately 500k images and Multi-PIE has 750k images.

The ground-truth dataset is created by human labelers who annotate the facial keypoints.

If you are looking to re-train with your own dataset, please follow the guideline below.

Label the keypoints in the correct order as accurately as possible. The human labeler can zoom in to a face region to correctly localize the keypoint.
Some keypoints, such as mouth corners or eye corners, are easily distinguishable. For keypoints that are not, such as the chin or nose, the human labeler should make the best estimate.
Label a keypoint as “occluded” if the keypoint is not visible due to an external object or due to extreme head pose angles. A keypoint is considered occluded when the keypoint is in the image but not visible.
To reduce discrepancy in labeling between multiple human labelers, the same keypoint ordering and instructions should be used across labelers. An independent human labeler may be used to check the quality of the annotated landmarks and make potential corrections.
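
One way to picture the guideline above is a per-keypoint record with an occlusion flag, plus a simple check that compares two labelers. This is a minimal sketch under assumptions: the `Keypoint` class, `check_count` and `mean_disagreement` are hypothetical names, and this is not the annotation format TAO Toolkit expects. The keypoint order is taken to be the list order.

```python
import math
from dataclasses import dataclass


@dataclass
class Keypoint:
    # Pixel coordinates; "occluded" means the point is inside the image
    # but hidden by an object or by extreme head pose.
    x: float
    y: float
    occluded: bool = False


def check_count(keypoints, expected=104):
    """Raise if an annotation does not contain the expected number of points."""
    if len(keypoints) != expected:
        raise ValueError(f"expected {expected} keypoints, got {len(keypoints)}")


def mean_disagreement(labels_a, labels_b):
    """Mean Euclidean distance between two labelers' annotations of one face.

    Points that either labeler marked as occluded are skipped, since their
    positions are estimates. Returns 0.0 if no point is visible in both.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotations must use the same keypoint ordering")
    pairs = [(p, q) for p, q in zip(labels_a, labels_b)
             if not (p.occluded or q.occluded)]
    if not pairs:
        return 0.0
    return sum(math.hypot(p.x - q.x, p.y - q.y) for p, q in pairs) / len(pairs)
```

A reviewer could flag faces whose disagreement exceeds some threshold chosen for the dataset. The threshold itself is not specified anywhere in the model card.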

Face bounding boxes labeling:

Face bounding boxes should be as tight as possible.
Label each face bounding box with an occlusion level ranging from 0 to 9. 0 means the face is fully visible and 9 means the face is 90% or more occluded. For training, only faces with occlusion level 0-5 are considered.
The datasets consist of webcam images, so truncation is rarely seen. If a face is at the edge of the frame with less than 60% visibility due to truncation, the image is dropped from the dataset.
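
The bounding-box rules above amount to a small filter. The sketch below assumes each face carries an occlusion level (0 to 9) and a visible fraction with respect to truncation. `FaceBox`, `visible_fraction` and `select_training_faces` are hypothetical names, not fields or functions of any NVIDIA tool.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_TRAIN_OCCLUSION = 5      # only levels 0-5 are used for training
MIN_VISIBLE_FRACTION = 0.6   # below this, truncation drops the image


@dataclass
class FaceBox:
    occlusion_level: int            # 0 = fully visible, 9 = 90% or more occluded
    visible_fraction: float = 1.0   # share of the face inside the frame


def select_training_faces(faces: List[FaceBox]) -> Optional[List[FaceBox]]:
    """Return the faces usable for training, or None if the image is dropped."""
    for face in faces:
        if not 0 <= face.occlusion_level <= 9:
            raise ValueError("occlusion level must be between 0 and 9")
    # A face truncated to under 60% visibility removes the whole image.
    if any(face.visible_fraction < MIN_VISIBLE_FRACTION for face in faces):
        return None
    return [face for face in faces
            if face.occlusion_level <= MAX_TRAIN_OCCLUSION]
```

Returning `None` for a dropped image keeps that case distinct from an image that is kept but has no face at occlusion level 0-5.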

The Sloth and Label-Studio tools have been utilized for labeling.

yingliu · May 17, 2022, 5:51am

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one.
Thanks
