Jetson Nano
DeepStream 4.0.1
JetPack 4.2.2
TensorRT 5.1.6.1
CUDA 10.0
How can we do face recognition using DeepStream? Actually, I want to do verification, i.e. embedding matching, not classification. Is there any way to achieve this in DeepStream? I want to use FaceNet to extract a 128-dimensional embedding from a face detected via webcam and compare it with the 128-dimensional embeddings stored in the same folder to determine whether it is the same person.
When we use a neural network, we use convolutional layers + pooling layers and maybe batch normalisation layers. After these layers we flatten the output and then add the classifier layers, which are basically the dense layers. So for face verification we only need the conv + pooling + batch norm layers, which extract features of tensor shape (1, 128) in the case of FaceNet or (1, 2048) in the case of VGGFace, for example. We do not need the flatten and dense layers for verification because we are not classifying. So the embeddings are the features extracted from the architecture without the flatten and dense layers. Is there a way to get these in DeepStream?
Do you mean you want to analyse the raw tensor outputs from conv+pooling+bn in your own way (unconventional processing that differs from flatten+fully_connected+softmax)?
Yes, that is correct. Since I only need to verify whether the person in front of the camera is the person we are looking for, I do not need a classifier. Just the raw tensor outputs, as you mentioned.
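To make the verification step concrete, here is a minimal sketch of comparing two embeddings by Euclidean distance. The names `Embedding`, `embeddingDistance` and `isSamePerson` are made up for illustration, and the default threshold of 1.0 is only an assumption that would have to be tuned on your own data.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// A FaceNet-style embedding: 128 floats per face.
constexpr std::size_t kEmbeddingSize = 128;
using Embedding = std::array<float, kEmbeddingSize>;

// Euclidean (L2) distance between two embeddings.
float embeddingDistance(const Embedding& a, const Embedding& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < kEmbeddingSize; ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Returns true when the two faces are close enough to count as the same
// person. The threshold is an assumption and must be tuned on real data.
bool isSamePerson(const Embedding& probe, const Embedding& reference,
                  float threshold = 1.0f) {
    return embeddingDistance(probe, reference) < threshold;
}
```

In use, `probe` would be the embedding of the face seen by the webcam and `reference` one of the stored embeddings; no classifier layer is involved.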
I think you have to implement your own parse function to handle the network outputs.
You can find this piece of code (it is from DeepStream 5.0, but I think DeepStream 4.0 should be similar):
    /* Call custom parsing function if specified otherwise use the one
     * written along with this implementation. */
    if (m_CustomClassifierParseFunc)
    {
        if (!m_CustomClassifierParseFunc(outputLayers, m_NetworkInfo,
                m_ClassifierThreshold, attributes, attrString))
        {
            printError("Failed to parse classification attributes using "
                    "custom parse function");
            return NVDSINFER_CUSTOM_LIB_FAILED;
        }
    }
    else
    {
        if (!parseAttributesFromSoftmaxLayers(outputLayers, m_NetworkInfo,
                m_ClassifierThreshold, attributes, attrString))
        {
            printError("Failed to parse bboxes");
            return NVDSINFER_OUTPUT_PARSING_FAILED;
        }
    }
This code shows that you can supply your own custom classifier parse function, which DeepStream calls instead of the built-in softmax parser.
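As a rough sketch of what the body of such a parse function might do with one output layer: copy the raw floats out of the layer buffer and L2-normalise them into an embedding. The `OutputLayer` struct below is a hypothetical stand-in so the example compiles on its own; in a real parser you would read the buffer and element count from the `NvDsInferLayerInfo` entries passed in `outputLayers` (declared, as far as I know, in `nvdsinfer_custom_impl.h`). It also assumes the layer output is FP32 and already in host memory.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the fields of NvDsInferLayerInfo that matter
// here; a real parser would use the DeepStream type instead.
struct OutputLayer {
    const void* buffer;       // host pointer to the layer's raw output
    std::size_t numElements;  // number of floats in that output
};

// Copies the raw output into a vector and L2-normalises it.
// Returns an empty vector if the layer has no data.
// Assumption: the buffer holds 32-bit floats.
std::vector<float> extractEmbedding(const OutputLayer& layer) {
    if (layer.buffer == nullptr || layer.numElements == 0) {
        return {};
    }
    const float* data = static_cast<const float*>(layer.buffer);
    std::vector<float> embedding(data, data + layer.numElements);

    float sumSquares = 0.0f;
    for (float v : embedding) {
        sumSquares += v * v;
    }
    const float norm = std::sqrt(sumSquares);
    if (norm > 0.0f) {
        for (float& v : embedding) {
            v /= norm;
        }
    }
    return embedding;
}
```

The resulting vector can then be compared against the stored embeddings, rather than being turned into class attributes as the default softmax parser does.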
You then have to add the following configuration to the [property] section so that DeepStream calls your custom parse function:
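The configuration from the original reply is not preserved here. As an illustration only, the Gst-nvinfer keys normally used for this are `parse-classifier-func-name` and `custom-lib-path`; the function name and library path below are hypothetical placeholders, not values from this thread.

```ini
[property]
# Hypothetical name of the parse function exported by your library
parse-classifier-func-name=NvDsInferParseCustomEmbedding
# Hypothetical path to the compiled shared library
custom-lib-path=/path/to/libnvds_custom_embedding_parser.so
```

The function name must match the symbol exported from the shared library, so it is worth checking the exact spelling against your own build.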
Okay. I got the idea behind how to do it. Just have to do it now. Thank you so much. I have shifted to Deepstream 5.0 now. But I still need to do this. Thank you again.
@preronamajumder were you able to figure this out or implement it? I have a very similar use case to yours. It would be great to know how this was achieved.
I did write a parser function, but I was unable to compile it successfully. I was getting an "Unidentified symbol" error with Cudastream while creating the shared library, so I gave up on this. There is a Git repo with a FaceNet implementation, though. The only issue is that it does not give the same values as the actual FaceNet by David Sandberg, probably because of preprocessing.
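One possible source of such a preprocessing mismatch, sketched below under assumptions: as far as I recall, the reference FaceNet code normalises each image either by per-image "prewhitening" or, for later models, by a fixed standardisation with constants 127.5 and 128. Gst-nvinfer applies one fixed scale and offset to every frame, so it can reproduce the second form but not the first. The function names are illustrative, and the constants should be checked against the model you actually use.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Per-image prewhitening: subtract the image's own mean and divide by its
// own standard deviation, with a lower bound to avoid dividing by ~0.
std::vector<float> prewhiten(const std::vector<float>& pixels) {
    if (pixels.empty()) {
        return {};
    }
    const float n = static_cast<float>(pixels.size());
    float mean = 0.0f;
    for (float p : pixels) {
        mean += p;
    }
    mean /= n;
    float var = 0.0f;
    for (float p : pixels) {
        var += (p - mean) * (p - mean);
    }
    const float stdDev = std::sqrt(var / n);
    const float stdAdj = std::max(stdDev, 1.0f / std::sqrt(n));
    std::vector<float> out;
    out.reserve(pixels.size());
    for (float p : pixels) {
        out.push_back((p - mean) / stdAdj);
    }
    return out;
}

// Fixed standardisation: the same constants for every image
// (assumed values: offset 127.5, divisor 128).
std::vector<float> fixedStandardize(const std::vector<float>& pixels) {
    std::vector<float> out;
    out.reserve(pixels.size());
    for (float p : pixels) {
        out.push_back((p - 127.5f) / 128.0f);
    }
    return out;
}
```

Feeding the same crop through both functions gives different inputs to the network, which would be enough to make the embeddings diverge from the reference values.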