Face Embeddings for FaceNet Face Recognition DeepStream app

Please provide complete information as applicable to your setup.

NVIDIA Jetson Xavier NX
DeepStream Version: 5.0
Jetpack 4.4.1 [L4T 32.4.4]
TensorRT: 7.1.3.0
CUDA: 10.2.89
cuDNN: 8.0.0.180
Visionworks: 1.6.0.501
OpenCV: 4.1.1 compiled CUDA: NO
VPI: 0.4.4
Vulkan: 1.2.70

I have created a DeepStream face recognition app, but it is not fully complete. I used a customized DeepStream YOLOv3 as the face detector and FaceNet for face recognition, using the DeepStream C++ implementation with an .mp4 video file as the input test file; bounding boxes are drawn around the faces of the people in the video.

I am also using a dynamic FaceNet ONNX model.

The last part is creating embeddings for my dataset, so that instead of just drawing a bbox around each face, the app also displays the name of that person (recognizing who the person is). How can I implement that? I am stuck on this part.

Below are my FaceNet DeepStream app configuration files and cpp:
*deepstream_infer_tensor_meta_test.cpp (32.9 KB) dstensor_pgie_config.txt (3.5 KB) dstensor_sgie1_config.txt (3.6 KB) dstensor_sgie2_config.txt (3.6 KB) dstensor_sgie3_config.txt (3.7 KB) dstest2_sgie1_config.txt (3.7 KB)

And below are the configuration files from the YOLOv3 app in case they are needed:

config_infer_primary_yoloV3.txt (3.4 KB) deepstream_app_config_yoloV3.txt (3.6 KB) dstest2_pgie_config.txt (3.4 KB) yolov3.cfg (8.1 KB) yoloPlugins.cpp (6.2 KB) yoloPlugins.h (5.7 KB)

I really appreciate your help.


Do you mean how to draw the name of the person next to the bbox in the frame?

Yes, exactly that: the name of the person displayed along with the bbox.

You can refer to

/opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test2/deepstream_test2_app.c

function:

static GstPadProbeReturn
osd_sink_pad_buffer_probe (GstPad * pad, GstPadProbeInfo * info,
    gpointer u_data)
{
....
}

Is there an option to use a pickle or .npz file?

DeepStream does not have customized APIs for these, but you can use third-party APIs in your DeepStream C++/Python code.
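For instance, a minimal sketch of persisting a name-to-embedding gallery with either .npz or pickle (the names, the 128-d size, and the file paths are all illustrative placeholders, not part of any DeepStream API):

```python
import os
import pickle
import tempfile
import numpy as np

# Hypothetical gallery: one reference 128-d FaceNet embedding per person
# (names and dimensionality are illustrative).
gallery = {
    "obama": np.random.rand(128).astype(np.float32),
    "michelle": np.random.rand(128).astype(np.float32),
}

workdir = tempfile.mkdtemp()

# Option 1: .npz -- each person's name becomes an array key in the archive.
npz_path = os.path.join(workdir, "face_gallery.npz")
np.savez(npz_path, **gallery)
archive = np.load(npz_path)
names = sorted(archive.files)
embeddings = np.stack([archive[n] for n in names])  # shape (2, 128)

# Option 2: pickle -- stores the whole dict as-is.
pkl_path = os.path.join(workdir, "face_gallery.pkl")
with open(pkl_path, "wb") as f:
    pickle.dump(gallery, f)
with open(pkl_path, "rb") as f:
    restored = pickle.load(f)
```

Either file can then be loaded once at pipeline start and kept in memory for the per-frame comparison.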

How can I have the OSD get embeddings from datasets (photos of the people to be recognized in the video) and display their names on the screen?

For example:

From this image, instead of displaying “Face”, it would say “Obama”, or “Michelle Obama”, and so on.

Hi @hirwablaise ,
Sorry! dstest2_sgie1_config.txt is the FaceNet config, right?
You can just give dstest2_sgie1 a label config, i.e. “labelfile-path=xxx”, for example:

gie config: https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/blob/master/configs/frcnn_tlt/pgie_frcnn_tlt_config.txt#L28
and write the corresponding string for each classification id in the label file, e.g. https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/blob/master/configs/frcnn_tlt/frcnn_labels.txt

Thanks!

Thank you, but I don't think this is the way. I have a dataset with multiple subfolders, each named after a person, e.g. Obama. Each subfolder contains many photos of that person from every angle, lighting condition, etc. So how can I create embeddings for this dataset so that FaceNet can compare them with the embeddings extracted from the test video file, and then put the name of the person whose face is detected in a frame? Thank you.
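One common way to turn such a dataset into reference embeddings (a sketch, not a DeepStream API): walk the subfolders, embed every photo, and average the embeddings per person. `embed_image` below is a hypothetical stand-in for the actual FaceNet inference call (e.g. running the ONNX model on one image).

```python
import os
import numpy as np

def build_gallery(dataset_dir, embed_image):
    """Build one averaged reference embedding per person.

    Expects the layout dataset_dir/<person_name>/<photo files>.
    embed_image(path) -> 1-D numpy array is a hypothetical hook for
    your FaceNet inference on a single image.
    """
    gallery = {}
    for name in sorted(os.listdir(dataset_dir)):
        person_dir = os.path.join(dataset_dir, name)
        if not os.path.isdir(person_dir):
            continue
        vecs = [embed_image(os.path.join(person_dir, photo))
                for photo in sorted(os.listdir(person_dir))]
        mean = np.mean(vecs, axis=0)
        gallery[name] = mean / np.linalg.norm(mean)  # L2-normalize
    return gallery
```

The resulting dict (person name -> averaged embedding) can then be persisted with pickle or np.savez and loaded once when the pipeline starts.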

I also did a Python implementation of FaceNet based on deepstream_test2, for the .h264 video file format. Which implementation would be more efficient and simpler: the C++ implementation or the Python one?

Does each embedding have a unique classification id?

It depends on which language you are better at.
For an implementation like yours, DS C++ and DS Python should be similar.
But DS Python supports fewer functions than DS C/C++, so if you want to add more functionality in the future, C++ may be the better choice.

Yes. For example, this is the output of FaceNet for a face found in a frame.

These embeddings represent all the face variations found in that exact frame.

If so, I don’t see why we can’t use labels.
As the deepstream-test2 output screenshot below (run with the command below) shows, it can embed the car type, color, and model on each detected object. What’s the difference between that and your case?

./deepstream-test2-app ../../../../samples/streams/sample_720p.h264

Oh, I thought it doesn’t work with labels, because the app was modified to only recognize faces. And in some frames there are more than 5 faces. If we use labels, how can I assign a unique id to each person’s face? Thank you.

It has nothing to do with face vs. car… I mean, they are based on the same logic.

And, as in my screenshot, there are many cars in one frame; all of them can be identified and given labels.

How can I apply that logic to my case, please? How can I assign different labels to different people in the same frame for a single class (Face)? It seems like I will need nvtracker, but how do I do it? I am using YOLOv3 as the detector.

I mean giving it a label config and a label file like below.
I don’t mean simply copying the current label files…

gie config: deepstream_tlt_apps/pgie_frcnn_tlt_config.txt at master · NVIDIA-AI-IOT/deepstream_tlt_apps · GitHub
and write the corresponding string for each classification id in the label file, e.g. deepstream_tlt_apps/frcnn_labels.txt at master · NVIDIA-AI-IOT/deepstream_tlt_apps · GitHub

How do I get the classification id for different faces? Sorry, I am still a newbie to this tracking part.

After some research and observation, I saw that I need to compare the vectors (embeddings from the datasets) with the embeddings obtained from FaceNet for each detected face, as shown in the image below.

I know that FaceNet has to compare the distance between the database embeddings and the FaceNet embeddings and check for similarity, but how do I apply that in my FaceNet Python app code?
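That comparison itself is small; a minimal sketch is below (a nearest-neighbor lookup over L2 distance on normalized embeddings; the `threshold` default is a placeholder you would tune on your own data, not a FaceNet constant):

```python
import numpy as np

def match_face(embedding, gallery, threshold=1.0):
    """Return the gallery name closest to `embedding`, or 'Unknown'.

    gallery: dict of name -> 1-D reference embedding (e.g. the averaged,
    L2-normalized vectors built from your dataset).
    threshold: maximum allowed L2 distance; tune it on your own data.
    """
    embedding = embedding / np.linalg.norm(embedding)
    names = list(gallery)
    refs = np.stack([gallery[n] for n in names])
    dists = np.linalg.norm(refs - embedding, axis=1)
    best = int(np.argmin(dists))
    return names[best] if dists[best] < threshold else "Unknown"
```

In the DS Python app you would call something like this from the probe where you read the sgie output tensor meta for each object, then write the returned name into that object's display text.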

Could you try?

  1. Add “labelfile-path=./face_labels.txt” in dstest2_sgie1_config.txt as below

  2. Have some names in face_labels.txt like below, and put face_labels.txt in the folder where dstest2_sgie1_config.txt is located
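The two steps above might look roughly like this (file names and labels are placeholders; check the exact label-file format expected by your network type in the nvinfer docs):

```
# dstest2_sgie1_config.txt (excerpt) -- add under [property]:
labelfile-path=./face_labels.txt

# face_labels.txt (same folder as the config), one name per line,
# in classification-id order, e.g.:
#   obama
#   michelle
```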


and check whether the name is now embedded on the bbox.