Feature extraction via SGIE

Hi there!

I am trying to do a recognition-based task on the Jetson Xavier for a project using the DS Python bindings.

I figured I could use the pgie for detection and the sgie for feature extraction.
So far, the pgie works well for detection (courtesy of this repo - https://github.com/NVIDIA-AI-IOT/redaction_with_deepstream), but the sgie fails at feature extraction (I'm using an ONNX model).
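For context, the relevant part of my pipeline is built roughly like this (element and config-file names are just placeholders for my actual setup):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# primary detector (pgie) finds faces; the secondary network (sgie) should extract features
pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
sgie = Gst.ElementFactory.make("nvinfer", "secondary-inference")
pgie.set_property("config-file-path", "pgie_config.txt")          # placeholder file name
sgie.set_property("config-file-path", "sgie_arcface_config.txt")  # placeholder file name

# source, nvstreammux, tracker, sink, etc. omitted; the point is that the sgie sits
# directly downstream of the pgie so it runs on the objects the pgie detects:
# pipeline.add(pgie); pipeline.add(sgie); ...; pgie.link(sgie); ...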

Here’s my sgie config file:
[property]
gpu-id=0
onnx-file=…/arcfaceresnet100.onnx
batch-size=1
process-mode=1
model-color-format=0
network-mode=1
interval=0
gie-unique-id=2
output-blob-names=id
#parse-bbox-func-name=NvDsInferParseCustomResnet
#custom-lib-path=/path/to/libnvdsparsebbox.so
#enable-dbscan=1
operate-on-gie-id=1
operate-on-class-ids=0

Is this a case for a custom parse-bbox function, or just a matter of setting up the sgie config file appropriately so it outputs feature vectors?

I'm running on a Jetson AGX Xavier with DS 5.0 / JetPack 4.3 Developer Preview / TensorRT 7.1.0-1+cuda10.2.

Thanks in advance for the help!


May I know which model your sgie uses? Can it use the current DeepStream post-processing?

I was trying to use ArcFace after extracting the bounding box…

As far as I know, it cannot use DeepStream post-processing, but I don’t know too much about that.

I found a TensorRT solution here: https://medium.com/@penolove15/face-recognition-with-arcface-with-tensorrt-abb544738e39

The same model (as implemented in ONNX) can be found here:

And the original ArcFace can be found here:

Hi @patrick.g.tinsley,
Sorry for the delay!
If you can't use the post-processing DeepStream currently provides, you could refer to nvdsinfer_custombboxparser_ssd_tlt.cpp to register a post-processor that parses the inference output.
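Once such a parser is compiled into a shared library, it is hooked into the nvinfer config file along these lines (the function name and library path below are only placeholders; use whatever symbol the library you build actually exports):

parse-bbox-func-name=NvDsInferParseCustomSSDTLT
custom-lib-path=/path/to/libnvds_infercustomparser.so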

Hi @mchi – thank you for your response!

I looked at the example you suggested (which does clipping/NMS), but I'm looking for a way for the parser to actually work with the frame data itself in the context of feature extraction. Any ideas?

Thanks!

PT

Sorry, I don't quite get your point. Could you make it clearer?

It seems that the custom bbox-parser function doesn't interact with the frame pixel data itself, since it only sees the detector or classifier output layers. However, the pixel data would be needed to extract facial features with a given model. I suppose I could add another parameter to the parser function (the raw frame data), but I was wondering whether there is an existing pipeline for this kind of detect-then-process-on-a-bbox task.
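To make it concrete, what I'm hoping to end up with is a probe roughly like the sketch below. It assumes the sgie is configured to attach its raw output as tensor meta (output-tensor-meta=1 and, I believe, network-type=100 so that no built-in parsing is attempted), which is exactly the part I'm unsure about:

import pyds
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

EMBEDDING_SIZE = 512  # assumed length of the ArcFace embedding

def sgie_src_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            l_user = obj_meta.obj_user_meta_list
            while l_user is not None:
                user_meta = pyds.NvDsUserMeta.cast(l_user.data)
                if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
                    tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
                    # layer 0 should be the "id" output blob holding the face embedding
                    layer = pyds.get_nvds_LayerInfo(tensor_meta, 0)
                    embedding = [pyds.get_detections(layer.buffer, i) for i in range(EMBEDDING_SIZE)]
                    # ... store / compare the embedding here ...
                l_user = l_user.next
            l_obj = l_obj.next
        l_frame = l_frame.next
    return Gst.PadProbeReturn.OK

The probe would then be attached to the sgie's src pad, e.g. sgie.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, sgie_src_pad_buffer_probe, 0). Is that roughly the intended approach, or is the custom parser the only way?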

There has been no update from you for a while, so we are assuming this is no longer an issue and are closing this topic. If you need further support, please open a new one.
Thanks

Hi @patrick.g.tinsley
Sorry for the long delay! I'm not sure whether you have found a solution yet. For your question: if you can access the corresponding frame, would that meet your requirement?
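For example, something along these lines (a rough sketch based on the deepstream-imagedata-multistream sample; it assumes the buffer has been converted to RGBA by an nvvideoconvert + capsfilter upstream of the probe):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import numpy as np
import pyds

def frame_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        # full decoded frame as a numpy (RGBA) array
        frame = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            r = obj_meta.rect_params
            # crop of the detected face; this could then be handed to ArcFace outside DeepStream
            face = np.array(frame[int(r.top):int(r.top + r.height),
                                  int(r.left):int(r.left + r.width)], copy=True)
            l_obj = l_obj.next
        l_frame = l_frame.next
    return Gst.PadProbeReturn.OK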

Thanks!