What model to use for face recognition?

• Jetson AGX Xavier
• Deepstream 6.0
• JetPack 4.6
• TensorRT 8.0.1
• NVIDIA GPU Driver 32.6.1

Hello, I am going to build a face recognition system.
As I understand it, I need to use the deepstream-infer-tensor-meta-app sample.
For the PGIE I will use YOLOv5 (because I need to detect more than just faces).
For the SGIE, from my research the best options are InsightFace (ArcFace) or FaceNet (triplet loss).
Then I would get the tensor meta of each face (as I understand it, the tensor meta contains the face embeddings, am I right?) and compute the cosine distance against every face in the database.

  • I can’t find any DeepStream implementation examples — please help me find one.
  • And where can I get pretrained models for recognition, I mean FaceNet or InsightFace? (I know this is not a DeepStream question.)
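To make the matching step in my plan above concrete, here is a minimal sketch of what I mean (plain NumPy; the database dict, names, and threshold are hypothetical placeholders, and in the real pipeline the embeddings would come from the SGIE tensor meta):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(embedding: np.ndarray, database: dict, threshold: float = 0.5):
    """Return the best-matching name in `database`, or None if below threshold."""
    best_name, best_score = None, -1.0
    for name, ref in database.items():
        score = cosine_similarity(embedding, ref)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```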

demo :deepstream_tao_apps/apps/tao_others/deepstream-faciallandmark-app at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub
model: Facial Landmarks Estimation | NVIDIA NGC

Thank you very much. Am I right that face landmarks are not embeddings for recognition?
Do I need to use the FaceNet output tensor as embeddings for the distance calculation?

Please refer to this: Face Embeddingd for FaceNet Face Recognition DeepStream app - #20 by hirwablaise
The facial landmarks model outputs face keypoints.

Thank you :D

Hi, I used the code from Face Embeddingd for FaceNet Face Recognition DeepStream app - #20 by hirwablaise

And got this error:

WARNING: Overriding infer-config batch-size (1) with number of sources (3)
Failed to load config file: No such file or directory
** ERROR: <gst_nvinfer_parse_config_file:1303>: failed
Now playing...
0:00:00.386799685  7040   0x5598b20460 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1161> [UID = 1]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
0:00:05.803056709  7040   0x5598b20460 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 1]: deserialized trt engine from :/home/cv/Desktop/Mask-Detection/Deepstream-app/model.etlt_b1_gpu0_int8.engine
INFO: [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input_1         3x416x736       
1   OUTPUT kFLOAT output_bbox/BiasAdd 4x26x46         
2   OUTPUT kFLOAT output_cov/Sigmoid 1x26x46         

0:00:05.803427671  7040   0x5598b20460 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2004> [UID = 1]: Use deserialized engine model: /home/cv/Desktop/Mask-Detection/Deepstream-app/model.etlt_b1_gpu0_int8.engine
0:00:05.817956466  7040   0x5598b20460 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<secondary1-nvinference-engine> [UID 1]: Load new model:./dstest2_sgie1_config.txt sucessfully
0:00:05.818636915  7040   0x5598b20460 WARN                 nvinfer gstnvinfer.cpp:794:gst_nvinfer_start:<primary-nvinference-engine> error: Configuration file parsing failed
0:00:05.818735832  7040   0x5598b20460 WARN                 nvinfer gstnvinfer.cpp:794:gst_nvinfer_start:<primary-nvinference-engine> error: Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/custom_yolo_face/dstest2_pgie_config.txt
Running...
ERROR from element primary-nvinference-engine: Configuration file parsing failed
Error details: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(794): gst_nvinfer_start (): /GstPipeline:dstensor-pipeline/GstNvInfer:primary-nvinference-engine:
Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/custom_yolo_face/dstest2_pgie_config.txt
Returned, stopping playback
Deleting pipeline


But I made a few changes:
I used the FaceNet model from your link with this config:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=nvidia_tlt
tlt-encoded-model=/home/cv/Desktop/Mask-Detection/Deepstream-app/model.etlt
labelfile-path=labels_facenet.txt
int8-calib-file=/home/cv/Desktop/Mask-Detection/Deepstream-app/int8_calibration.txt
model-engine-file=/home/cv/Desktop/Mask-Detection/Deepstream-app/model.etlt_b1_gpu0_int8.engine
infer-dims=3;416;736
uff-input-order=0
uff-input-blob-name=input_1
batch-size=1
process-mode=2
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=1
interval=0
gie-unique-id=1
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

[class-attrs-all]
pre-cluster-threshold=0.2
group-threshold=1
## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
eps=0.2
#minBoxes=3

For YOLO I used this config:

[source0]
drop-frame-interval=24

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0

model-engine-file=/opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/model_b1_gpu0_fp16.engine
model-file=/home/cv/Desktop/Mask-Detection/Deepstream-app/best.wts
custom-network-config=/home/cv/Desktop/Mask-Detection/Deepstream-app/best.cfg
# onnx-file=/home/cv/Desktop/CV.Hermes/deepstream_python_apps/apps/deepstream_video_yolov5/yolov5.onnx
uff-input-dims=3;640;640;0
batch-size=1
gie-unique-id=40001
#output-blob-names=prob
network-mode=2
network-type=0
process-mode=1
num-detected-classes=3

cluster-mode=2
secondary-reinfer-interval=0
force-implicit-batch-dim=1

parse-bbox-func-name=NvDsInferParseYolo
custom-lib-path=/home/cv/Desktop/CV.Hermes/deepstream_apps/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
labelfile-path=/home/cv/Desktop/CV.Hermes/deepstream_apps/deepstream_video_inference/labels.txt
engine-create-func-name=NvDsInferYoloCudaEngineGet

So what caused this error?

WARNING: Overriding infer-config batch-size (1) with number of sources (3)
Failed to load config file: No such file or directory
** ERROR: <gst_nvinfer_parse_config_file:1303>: failed

I didn’t change the code from: Face Embeddingd for FaceNet Face Recognition DeepStream app - #20 by hirwablaise

Failed to load config file: No such file or directory
Config file path: /opt/nvidia/deepstream/deepstream-5.0/sources/custom_yolo_face/dstest2_pgie_config.txt
Please check the config path.

I used another config and the path is correct now. I understood that in uff-input-dims the first number is the batch size, not the number of dimensions. What does uff-input-dims mean? According to the documentation, for YOLO I need to use 3;640;640;0 because I have an RGB 640x640 image.

Well, I can’t say exactly what I changed, but now I get this error:

Now playing...
0:00:00.317954582 23234   0x55b4c9c260 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1161> [UID = 1]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
0:00:04.648032550 23234   0x55b4c9c260 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 1]: deserialized trt engine from :/home/cv/Desktop/Mask-Detection/Deepstream-app/model.etlt_b1_gpu0_int8.engine
INFO: [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input_1         3x416x736       
1   OUTPUT kFLOAT output_bbox/BiasAdd 4x26x46         
2   OUTPUT kFLOAT output_cov/Sigmoid 1x26x46         

0:00:04.648579934 23234   0x55b4c9c260 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2004> [UID = 1]: Use deserialized engine model: /home/cv/Desktop/Mask-Detection/Deepstream-app/model.etlt_b1_gpu0_int8.engine
0:00:04.660167696 23234   0x55b4c9c260 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<secondary1-nvinference-engine> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/dstest2_sgie1_config.txt sucessfully
Deserialize yoloLayer plugin: yolo_93
Deserialize yoloLayer plugin: yolo_96
Deserialize yoloLayer plugin: yolo_99
0:00:04.961636709 23234   0x55b4c9c260 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 40001]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 40001]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/model_b1_gpu0_fp16.engine
INFO: [Implicit Engine Info]: layers num: 4
0   INPUT  kFLOAT data            3x640x640       
1   OUTPUT kFLOAT yolo_93         24x80x80        
2   OUTPUT kFLOAT yolo_96         24x40x40        
3   OUTPUT kFLOAT yolo_99         24x20x20        

0:00:04.961864207 23234   0x55b4c9c260 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 40001]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:1833> [UID = 40001]: Backend has maxBatchSize 1 whereas 3 has been requested
0:00:04.961913105 23234   0x55b4c9c260 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 40001]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2012> [UID = 40001]: deserialized backend context :/opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/model_b1_gpu0_fp16.engine failed to match config params, trying rebuild
0:00:05.021022300 23234   0x55b4c9c260 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 40001]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 40001]: Trying to create engine from model files
YOLO config file or weights file is not specified
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:05.022360214 23234   0x55b4c9c260 ERROR                nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 40001]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1934> [UID = 40001]: build engine file failed
0:00:05.022434521 23234   0x55b4c9c260 ERROR                nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 40001]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2020> [UID = 40001]: build backend context failed
0:00:05.022477883 23234   0x55b4c9c260 ERROR                nvinfer gstnvinfer.cpp:632:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 40001]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1257> [UID = 40001]: generate backend failed, check config file settings
0:00:05.023088853 23234   0x55b4c9c260 WARN                 nvinfer gstnvinfer.cpp:841:gst_nvinfer_start:<primary-nvinference-engine> error: Failed to create NvDsInferContext instance
0:00:05.023144119 23234   0x55b4c9c260 WARN                 nvinfer gstnvinfer.cpp:841:gst_nvinfer_start:<primary-nvinference-engine> error: Config file path: /opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/dstest2_pgie_config.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
Running...
ERROR from element primary-nvinference-engine: Failed to create NvDsInferContext instance
Error details: /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(841): gst_nvinfer_start (): /GstPipeline:dstensor-pipeline/GstNvInfer:primary-nvinference-engine:
Config file path: /opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/dstest2_pgie_config.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
Returned, stopping playback
Deleting pipeline

As I understand it, this is a YOLOv5 error. I used the same config file in deepstream-python-apps and it worked well. What does uff-input-dims mean? From what I have seen in the documentation: channel; height; width; input-order (0: NCHW, 1: NHWC).
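So my current reading of the documentation (an assumption on my side, please correct me if I am wrong): the four semicolon-separated values are channels, height, width, and input order, and newer DeepStream releases appear to split this into infer-dims plus uff-input-order, i.e.:

# older style: channels;height;width;input-order (0 = NCHW, 1 = NHWC)
uff-input-dims=3;640;640;0

# newer style: dimensions and input order as separate keys
infer-dims=3;640;640
uff-input-order=0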

Well, I changed batch-size to 3 and the .engine built. But I got this error when the pipeline ran:

** (deepstream-infer-tensor-meta-app:25164): ERROR **: 11:16:54.382: failed to check pgie network info

Sheesh, after hours of debugging I got the pipeline running. But I have a question. The FaceNet model used in the TAO app returns two tensors: a 4x26x46 bbox-coordinate tensor and a 1x26x46 class-confidence tensor. So which tensor do I need to use for recognition?
I know I need to compare the embeddings from my dataset with the embeddings I get, using cosine similarity for example. But how can I do this with DeepStream? Sure, I can do it every frame in the sgie_pad_buffer_probe() function, but is there an easier way?
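To be concrete, the per-frame check I have in mind would look roughly like this (a plain NumPy sketch; the gallery matrix and names are made-up placeholders, and the query embedding would come from the SGIE output tensor in the probe):

```python
import numpy as np

# hypothetical gallery: one L2-normalized embedding row per enrolled person
names = ["alice", "bob"]
gallery = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])

def match(embedding: np.ndarray, threshold: float = 0.6):
    """Cosine-match one query embedding against the whole gallery at once."""
    q = embedding / np.linalg.norm(embedding)  # normalize the query
    scores = gallery @ q                       # cosine similarity per gallery row
    best = int(np.argmax(scores))
    if scores[best] >= threshold:
        return names[best], float(scores[best])
    return None, float(scores[best])
```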

Please refer to deepstream_tao_apps/deepstream_faciallandmark_app.cpp at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub; there you can find how to use the facial landmarks meta.

Hi, as I understand it, landmarks show me the positions of facial features: chin, eyes, etc. But I don’t need the position of a person’s face, I need to know who is in the image. How can I use landmarks for recognition? Should I use them as embeddings?

I tried to run with the facial landmarks model but got this error:

ERROR: [TRT]: Cannot find binding of given name: softargmax,softargmax:1,conv_keypoints_m80

Please provide the whole log.

Now playing...
0:00:00.176642553 14122   0x55969710c0 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1914> [UID = 2]: Trying to create engine from model files
WARNING: [TRT]: onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
WARNING: [TRT]: Missing scale and zero-point for tensor (Unnamed Layer* 47) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Detected invalid timing cache, setup a local cache instead
0:00:35.511542029 14122   0x55969710c0 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 2]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1947> [UID = 2]: serialize cuda engine to file: /home/cv/Desktop/Mask-Detection/Deepstream-app/model.etlt_b1_gpu0_int8.engine successfully
INFO: [Implicit Engine Info]: layers num: 4
0   INPUT  kFLOAT input_face_images 1x80x80         
1   OUTPUT kFLOAT conv_keypoints_m80 80x80x80        
2   OUTPUT kFLOAT softargmax      80x2            
3   OUTPUT kFLOAT softargmax:1    80              

ERROR: [TRT]: Cannot find binding of given name: softargmax,softargmax:1,conv_keypoints_m80
0:00:35.522664692 14122   0x55969710c0 WARN                 nvinfer gstnvinfer.cpp:635:gst_nvinfer_logger:<secondary1-nvinference-engine> NvDsInferContext[UID 2]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:1868> [UID = 2]: Could not find output layer 'softargmax,softargmax:1,conv_keypoints_m80' in engine
0:00:35.534332596 14122   0x55969710c0 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<secondary1-nvinference-engine> [UID 2]: Load new model:/opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/dstest2_sgie1_config.txt sucessfully
Deserialize yoloLayer plugin: yolo_93
Deserialize yoloLayer plugin: yolo_96
Deserialize yoloLayer plugin: yolo_99
0:00:35.676323173 14122   0x55969710c0 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 40001]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 40001]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/model_b1_gpu0_fp16.engine
INFO: [Implicit Engine Info]: layers num: 4
0   INPUT  kFLOAT data            3x640x640       
1   OUTPUT kFLOAT yolo_93         24x80x80        
2   OUTPUT kFLOAT yolo_96         24x40x40        
3   OUTPUT kFLOAT yolo_99         24x20x20        

0:00:35.676711158 14122   0x55969710c0 INFO                 nvinfer gstnvinfer.cpp:638:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 40001]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2004> [UID = 40001]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/model_b1_gpu0_fp16.engine
0:00:35.684813049 14122   0x55969710c0 INFO                 nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary-nvinference-engine> [UID 40001]: Load new model:/opt/nvidia/deepstream/deepstream-6.0/sources/apps/sample_apps/deepstream-infer-tensor-meta-test/dstest2_pgie_config.txt sucessfully
Decodebin child added: source
Decodebin child added: decodebin0
Running...
Decodebin child added: qtdemux0
Decodebin child added: multiqueue0
Decodebin child added: aacparse0
Decodebin child added: avdec_aac0
Decodebin child added: h264parse0
Decodebin child added: capsfilter0
Decodebin child added: nvv4l2decoder0
Opening in BLOCKING MODE 
NvMMLiteOpen : Block : BlockType = 261 
NVMEDIA: Reading vendor.tegra.display-size : status: 6 
NvMMLiteBlockCreate : Block : BlockType = 261 
In cb_newpad
In cb_newpad
End of stream
Returned, stopping playback
Deleting pipeline

I got video output, but sgie_pad_buffer_probe() doesn’t work. Perhaps the landmarks meta is retrieved in a different way?

About “sgie_pad_buffer_probe() doesn’t work”: what do you mean? It did not enter the callback function, or you can’t get some meta?

Well, I put in some couts for debugging, and it seems there is no user meta.

So, can you please explain how landmarks can help me identify a person? I googled and saw that landmarks show the positions of the eyes, eyebrows, etc. Maybe you can send me some GitHub repos with examples?

Here is the config for landmarks:


[property]
gpu-id=0

tlt-model-key=nvidia_tlt
int8-calib-file=/home/cv/Desktop/Mask-Detection/Deepstream-app/int8_calibration.txt
tlt-encoded-model=/home/cv/Desktop/Mask-Detection/Deepstream-app/model.etlt

labelfile-path=/home/cv/Desktop/Mask-Detection/Deepstream-app/labels.txt
model-engine-file=/home/cv/Desktop/Mask-Detection/Deepstream-app/model.etlt_b1_gpu0_int8.engine
#onnx-file=/home/cv/Desktop/Mask-Detection/Deepstream-app/arcfaceresnet100-8.onnx

network-mode=1
num-detected-classes=1
uff-input-blob-name=input_1
output-blob-names=softargmax,softargmax:1,conv_keypoints_m80
#0=Detection 1=Classifier 2=Segmentation 100=other
network-type=100
# Enable tensor metadata output
output-tensor-meta=1
#1-Primary  2-Secondary
process-mode=2
gie-unique-id=2
operate-on-gie-id=1 
net-scale-factor=1.0
offsets=0.0
input-object-min-width=5
input-object-min-height=5
#0=RGB 1=BGR 2=GRAY
model-color-format=2

[class-attrs-all]
threshold=0.0