- Hardware: Jetson AGX Orin
- DeepStream 7.0
- JetPack 6.0 (both 6.0+b106 and 6.0+b87 are installed), L4T 36.3.0
- TensorRT 8.6.2
- NVRM version: NVIDIA UNIX Open Kernel Module for aarch64 540.3.0
- Issue: I am trying to use deepstream-app for face recognition. I use a face detector as the primary model, then an embedding-generator model as the secondary to generate embeddings, which I compare against an existing database in a custom parser using cosine similarity.
- BUT I am not able to see output labels for the class IDs that the custom parser returns.
- However, the bboxes and labels of the primary detector are visible; I am facing the issue with the secondary only.
- As the models I am using are custom models, I set network-type=3 in both config files so that the custom parsers are called for both models.
So why is the network-type of the sgie 3 (Instance Segmentation)?
From your description, I think the output of the sgie is a set of vectors (a tensor), is that right?
You may need to configure the sgie as below.
# 0=Detector, 1=Classifier, 2=Segmentation, 100=Other
network-type=100
# Enable tensor metadata output
output-tensor-meta=1
Then, in your probe function:
for (NvDsMetaList *l_user = obj_meta->obj_user_meta_list; l_user != NULL;
     l_user = l_user->next) {
  NvDsUserMeta *user_meta = (NvDsUserMeta *) l_user->data;
  if (user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
    continue;
  /* Convert to tensor metadata */
  NvDsInferTensorMeta *meta =
      (NvDsInferTensorMeta *) user_meta->user_meta_data;
  /* Do face recognition here and attach the label */
}
You can refer to the sgie_pad_buffer_probe function in /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-infer-tensor-meta-test.
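For reference, a minimal sketch of what could go inside that loop to pull out the raw embedding (assuming a single FP32 output layer; the 512 dimension comes from your description and must match your model):

NvDsInferLayerInfo *layer = &meta->output_layers_info[0];
if (layer->dataType == FLOAT) {
  /* Host copy of the output tensor, available because output-tensor-meta=1 */
  float *embedding = (float *) meta->out_buf_ptrs_host[0];
  unsigned int dim = layer->inferDims.numElements;  /* expected: 512 */
  /* compare `embedding` (length `dim`) against your database here */
}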
Actually, I wanted to use the custom parser to load existing embedding files and implement cosine similarity, so I used network-type=3.
Yes, it generates 512-dimensional tensor embeddings.
I tried using network-type=100, but then the custom parser is not called during inference.
I have already modified the deepstream_app.c code so that my primary face detector shows 5 keypoints in addition to the model's bboxes on the output video, so I want to add face recognition within deepstream-app as well.
And here is the config file that I was using:
sgie_config.txt (648 Bytes)
OK, I understand what you mean. You want to use the keypoints as an instance mask. If so, network-type=3 is also OK.
Like this sample?
So currently your problem is that the custom parser is not being called?
Is the signature of NvDsInferParseRec like the one below?
extern "C"
bool NvDsInferParseCustomMrcnnTLT (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
NvDsInferNetworkInfo const &networkInfo,
NvDsInferParseDetectionParams const &detectionParams,
std::vector<NvDsInferInstanceMaskInfo> &objectList);
No, let me clarify:
- My pipeline is face_detection → gives face bboxes and 5 keypoints → face_recognition → uses the image region of each bbox to generate embeddings → compares them with existing database embeddings using cosine similarity.
- I have already modified deepstream-app to show bboxes and keypoints in the output using network-type=3, and it is working properly: keypoints and bboxes are visible.
- Now I want to show the face names stored in the labels file by using the secondary model. So tell me how I can use the generated embeddings, compare them with the database, return an ID based on the cosine score, and then show the corresponding face name from the labels file based on that ID.
So, please tell me how I can show the labels of the secondary model along with the primary ones using deepstream-app. What should I do?
1. Set the network-type in the sgie configuration file to 1 (Classifier).
2. Implement a custom parser, just like the skeleton below (a filled-in sketch follows it):
bool NvDsInferClassiferParseCustomSoftmax (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
                                           NvDsInferNetworkInfo const &networkInfo,
                                           float classifierThreshold,
                                           std::vector<NvDsInferAttribute> &attrList,
                                           std::string &descString)
{
  // 1. Get the output tensor from outputLayersInfo.
  // 2. Compare the output tensor with your database and get an ID.
  // 3. Put the label corresponding to the ID into attrList.
}
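Filled in, a rough sketch could look like the code below. This is only an illustration: gDatabase, gLabels, and the 512 dimension are placeholders you must replace with your own loading code and model output size.

#include <cmath>
#include <cstring>
#include <string>
#include <vector>
#include "nvdsinfer_custom_impl.h"

// Hypothetical pre-loaded database: gDatabase[i] is a 512-float embedding
// and gLabels[i] the matching face name from labels.txt.
extern std::vector<std::vector<float>> gDatabase;
extern std::vector<std::string> gLabels;

static float cosineSimilarity (const float *a, const float *b, size_t n)
{
  float dot = 0.f, na = 0.f, nb = 0.f;
  for (size_t i = 0; i < n; i++) {
    dot += a[i] * b[i];
    na  += a[i] * a[i];
    nb  += b[i] * b[i];
  }
  return dot / (std::sqrt (na) * std::sqrt (nb) + 1e-12f);
}

extern "C"
bool NvDsInferClassiferParseCustomSoftmax (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
                                           NvDsInferNetworkInfo const &networkInfo,
                                           float classifierThreshold,
                                           std::vector<NvDsInferAttribute> &attrList,
                                           std::string &descString)
{
  // 1. Get the output tensor (assumes a single FP32 embedding layer).
  const float *embedding = (const float *) outputLayersInfo[0].buffer;
  size_t dim = outputLayersInfo[0].inferDims.numElements;  // expected: 512

  // 2. Compare against every database entry and keep the best match.
  int bestId = -1;
  float bestScore = -1.f;
  for (size_t id = 0; id < gDatabase.size (); id++) {
    float score = cosineSimilarity (embedding, gDatabase[id].data (), dim);
    if (score > bestScore) {
      bestScore = score;
      bestId = (int) id;
    }
  }

  // 3. Attach the matching label; it must be strdup'ed or nullptr,
  //    since DeepStream takes ownership and frees it later.
  NvDsInferAttribute attr;
  attr.attributeIndex = 0;
  attr.attributeValue = (unsigned int) (bestId < 0 ? 0 : bestId);
  attr.attributeConfidence = bestScore;
  attr.attributeLabel = (bestId >= 0 && bestScore >= classifierThreshold)
      ? strdup (gLabels[bestId].c_str ()) : nullptr;
  attrList.push_back (attr);
  if (attr.attributeLabel)
    descString = attr.attributeLabel;
  return true;
}

Only entries with a non-null attributeLabel are drawn by the OSD, which is why step 3 matters for your symptom.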
As per your advice, I modified my custom parser, as you can see in the following file:
recog_parser.txt (3.9 KB)
and here is the config file:
sgie_config.txt (471 Bytes)
Using this, I face the following issue (see the attached screenshot):
and here are the other config files that I am using:
primary →
pgie_config.txt (750 Bytes)
app config →
main_app_config.txt (3.0 KB)
This seems to be a bug in your code; I can't run it.
The normal code should be like this:
attr.attributeLabel = strdup("your label");  /* or nullptr if there is no match */
Or you can debug it with gdb:
gdb --args your-program xxx
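A common cause of a crash or garbage label here (an assumption on my side, since I can't run your parser) is handing DeepStream a pointer it can't free(); the label must be heap-allocated or nullptr. In the snippet below, someString and matched are placeholder names:

// Wrong: points into a temporary std::string; DeepStream later calls
// free() on attributeLabel and this pointer is not heap-allocated.
// attr.attributeLabel = someString.c_str();

// Right: duplicate onto the heap, or use nullptr when there is no match.
attr.attributeLabel = matched ? strdup (someString.c_str ()) : nullptr;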
But I have a labels.txt file with 30 labels, so is there any way to use it?
Also, when I used instance segmentation as the network type, the above code worked properly.
Also, can you please provide me a reference where deepstream-app is used with a primary detection model and a secondary classification or recognition model?
When the output matches your database, how do the output IDs correspond to your labels.txt? Take the label from labels.txt and assign it to attributeLabel.
Regarding this, it depends on how you use it.
This doesn't work. First, this is not an instance-segmentation model. Second, if you don't set a label in the custom parser, it won't show up in the OSD.
This configuration file is a deepstream-app example with pgie + sgie (classifier), but I believe it is of no direct help to you:
/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt
Your pgie is already working; just implement the sgie as a classifier as I said above, along the lines of the config sketch below.
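For illustration, a minimal sgie config for classifier mode could look like this (engine path, function name, library path, and threshold are placeholders, not taken from your files):

[property]
# Run as a classifier on objects produced by the pgie
network-type=1
process-mode=2
operate-on-gie-id=1
gie-unique-id=2
model-engine-file=face_recognition.engine
# Custom classifier parser exported by your library
parse-classifier-func-name=NvDsInferClassiferParseCustomSoftmax
custom-lib-path=./librecog_parser.so
classifier-threshold=0.5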
So you are saying that I should attach all of my 30 labels in this?
attr.attributeLabel = strdup("A,B,C…");
And also, as I already said, I tried to implement the sgie as a classifier and provided you the respective files, but I was facing the issue shown in the above screenshot image.
And also let me clarify again:
For a single frame → the detector detects multiple faces and pushes back class id=0 and a confidence score in an NvDsInferObjectDetectionInfo object → for this single frame, the secondary model infers over all the faces/objects one by one → for each secondary inference over an object/face, the generated embeddings are compared against 30 txt files, each holding a single face embedding → the file whose embedding matches gives me an ID, since the file name contains it → using that ID I correlate to the names in labels.txt (see the loading sketch below).
For example, if the embedding matches file 23, then line 23 of labels.txt has the name that I want to show over that face.
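A sketch of that one-time loading, assuming hypothetical paths like embeddings/23.txt (each holding one 512-float embedding, whitespace-separated) and the 1-based file-ID to labels.txt line mapping described above:

#include <fstream>
#include <string>
#include <vector>

// Read labels.txt once: line N (1-based) holds the name for ID N,
// so labels[N - 1] is the name to draw over the face.
static std::vector<std::string> loadLabels (const std::string &path)
{
  std::vector<std::string> labels;
  std::ifstream in (path);
  std::string line;
  while (std::getline (in, line))
    labels.push_back (line);
  return labels;
}

// Read one embedding file: whitespace-separated floats (expected: 512).
static std::vector<float> loadEmbedding (const std::string &path)
{
  std::vector<float> emb;
  std::ifstream in (path);
  float v;
  while (in >> v)
    emb.push_back (v);
  return emb;
}

// Build the database once at startup: db[id - 1] is the embedding for ID id.
static std::vector<std::vector<float>> loadDatabase (int count)
{
  std::vector<std::vector<float>> db;
  for (int id = 1; id <= count; id++)
    db.push_back (loadEmbedding ("embeddings/" + std::to_string (id) + ".txt"));
  return db;
}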
This is not the case. attributeLabel should hold the result of the comparison with the database; if there is no match, assign it nullptr.
How to find the correct label from labels.txt is exactly what your custom parser should implement.
As I told you before, I tried attr.attributeLabel = strdup("A,B,C…");
just to check whether I could see "A,B,C" over the faces in my video, but nothing was visible in the output.
Also, can you please look at the following repo? I tried to use and understand it so that I could implement similar logic for my case, but when I tried to execute it, it got stuck after loading the models.
Sorry for the long delay.
Yes, this may help you understand the problem.
You can put a placeholder on the label until the OSD displays it, and then continue with the face recognition.
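For instance, a hard-coded placeholder inside the parser, purely as a debugging aid:

// If "TEST" appears over the faces, the classifier-meta path and the OSD
// are working, and only the recognition logic remains to be fixed.
NvDsInferAttribute attr;
attr.attributeIndex = 0;
attr.attributeValue = 0;
attr.attributeConfidence = 1.0f;
attr.attributeLabel = strdup ("TEST");
attrList.push_back (attr);
descString = "TEST";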
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.