Face Recognition in Deepstream?

My configurations

Jetson Nano
Deepstream 4.0.1
Jetpack 4.2.2
TensorRT 5.1.6.1
CUDA 10.0

How can we do face recognition using Deepstream? I actually want to do verification, i.e. embedding matching, not classification. Is there any way to achieve this in Deepstream? I want to use Facenet to extract a 128-dimensional embedding from a face detected via webcam and compare it with the 128-dimensional embeddings stored in the same folder to determine whether it is the same person.
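To make the comparison step concrete, here is a minimal sketch of embedding matching, independent of Deepstream. It assumes 128-value embeddings as described above and uses plain Euclidean distance; the names `Embedding`, `embeddingDistance` and `isSamePerson` are hypothetical, and the threshold is something you would have to tune on your own data.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Embedding size assumed from the post (Facenet: 128 values per face).
constexpr std::size_t kEmbeddingSize = 128;
using Embedding = std::array<float, kEmbeddingSize>;

// Euclidean (L2) distance between two embeddings.
float embeddingDistance(const Embedding& a, const Embedding& b)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < kEmbeddingSize; ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Hypothetical decision rule: treat two faces as the same person when
// their embeddings are closer than a threshold chosen by the caller.
bool isSamePerson(const Embedding& live, const Embedding& stored,
                  float threshold)
{
    return embeddingDistance(live, stored) < threshold;
}
```

With this, verification reduces to loading the stored embedding, computing the live one, and calling `isSamePerson(live, stored, threshold)`.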


Sorry! What do you mean by “embeddings”?

In a neural network we typically use convolutional layers, pooling layers and maybe batch normalisation layers. After these layers we flatten the output and add the classifier layers, which are basically dense layers. For face verification we only need the conv + pooling + batch norm layers, which extract features as a tensor of shape (1, 128) in the case of Facenet or (1, 2048) in the case of VGGFace, for example. We do not need the flatten and dense layers for verification because we are not classifying. So embeddings are the features extracted from the architecture without the flatten and dense layers. Is there a way to get these in Deepstream?

@preronamajumder

Do you mean you want to analyse the raw tensor outputs from conv+pooling+bn in your own way (non-conventional processing that is different from flatten+fully_connected+softmax)?

Yes that is correct. Since I only need to verify if the person in front of the camera is the person we are looking for, I do not need a classifier. Just the raw tensor outputs like you mentioned.

@preronamajumder

I think you have to implement your own parse function to handle network outputs.

Take a look at this piece of code (it is from DeepStream 5.0, but I think DeepStream 4.0 should be similar):

    /* Call custom parsing function if specified otherwise use the one
     * written along with this implementation. */
    if (m_CustomClassifierParseFunc)
    {
        if (!m_CustomClassifierParseFunc(outputLayers, m_NetworkInfo,
                m_ClassifierThreshold, attributes, attrString))
        {
            printError("Failed to parse classification attributes using "
                    "custom parse function");
            return NVDSINFER_CUSTOM_LIB_FAILED;
        }
    }
    else
    {
        if (!parseAttributesFromSoftmaxLayers(outputLayers, m_NetworkInfo,
                m_ClassifierThreshold, attributes, attrString))
        {
            printError("Failed to parse bboxes");
            return NVDSINFER_OUTPUT_PARSING_FAILED;
        }
    }

This piece of code shows that you can implement your own customized classifier parse function.
You then have to add the following settings to [property] so that DeepStream calls your customized parse function:

parse-classifier-func-name=name_of_your_own_customized_parse_function
custom-lib-path=dir_to_your_own_cpp_library/your_own_cpp_library.so

Your customized function would look like this (this is for DeepStream 5.0; I am not sure whether 4.0 is the same):

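#include <string>
#include <vector>
// Header from the DeepStream SDK (sources/includes) that declares the
// NvDsInfer* types used in the signature below.
#include "nvdsinfer_custom_impl.h"

extern "C" bool
name_of_your_own_customized_parse_function(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo, float classifierThreshold,
    std::vector<NvDsInferAttribute>& attrList, std::string& attrString)
{
    // TODO: Add your own parse logic here
    return true;
}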
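As a rough idea of what the parse logic could do with the raw output, the sketch below copies a float buffer into a vector and L2-normalises it. It is deliberately free of DeepStream types so it can be compiled on its own. The helper name `extractEmbedding` is hypothetical, and the sketch assumes the output layer holds FP32 values; inside a real parse function you would pass the `buffer` pointer of the relevant `NvDsInferLayerInfo` and its element count. The normalisation step is also an assumption; skip it if your network already outputs unit-length vectors.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Copies `count` floats from a raw output buffer and L2-normalises them.
// Assumption: the buffer holds FP32 values (e.g. the output layer's data).
std::vector<float> extractEmbedding(const void* buffer, std::size_t count)
{
    const float* data = static_cast<const float*>(buffer);
    std::vector<float> embedding(data, data + count);

    // Compute the Euclidean norm of the vector.
    float norm = 0.0f;
    for (float v : embedding) {
        norm += v * v;
    }
    norm = std::sqrt(norm);

    // Scale to unit length; an all-zero vector is left unchanged.
    if (norm > 0.0f) {
        for (float& v : embedding) {
            v /= norm;
        }
    }
    return embedding;
}
```

Keeping this logic in a plain helper makes it easy to test outside the pipeline before wiring it into the custom library.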

There are examples of customized parse functions here (The only difference is that they customize bbox parsing functions for detection networks):

objectDetector_FasterRCNN
objectDetector_SSD
objectDetector_Yolo

Okay. I got the idea behind how to do it. Just have to do it now. Thank you so much. I have shifted to Deepstream 5.0 now. But I still need to do this. Thank you again.

@preronamajumder were you able to figure this out or implement it? I have a very similar use case to yours. It would be great to know how this was achieved.

I did write a parser function, but I was unable to compile it successfully. I was getting an undefined symbol error involving Cudastream while creating the shared library, so I gave up on this. There is a Git repo with a Facenet implementation, though. The only issue is that it does not give the same values as the actual Facenet by David Sandberg, probably because of differences in preprocessing.