Please provide the following information when requesting support.
• Hardware: AGX Orin
• Network Type: Detectnet_v2 / FaceDetect
I am working on a project that involves converting an .etlt model to a TensorRT engine. The conversion itself succeeded, and I have been analyzing the resulting outputs.
The engine produces two outputs: the coverage map (per-cell probabilities for detected faces) and the bounding box coordinates for each detected face in the input image. The coverage values are consistent with the regions of the image where faces are present. However, I am having difficulty interpreting the bounding box output, which, according to the documentation, consists of "four normalized bounding-box parameters (xc, yc, w, h)".
It is unclear to me what these bounding box parameters are normalized with respect to, and how to construct the final bounding box for each detected face from them. Could you clarify these points, or point me towards any Python implementation samples that demonstrate this? The post-processing must be implemented somewhere in the DeepStream sources — could you point me to it?
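For reference, here is my current best guess at the decoding, sketched in NumPy. It assumes the grid-cell convention I have seen referenced for DetectNet_v2 (stride 16, cell-center offset 0.5, bbox normalization scale 35.0) — these constants and the exact sign convention are assumptions I would like confirmed:

```python
import numpy as np

# Assumed DetectNet_v2 decoding constants (please confirm):
STRIDE = 16.0      # model downsample factor (input px per grid cell)
OFFSET = 0.5       # grid-cell center offset
BBOX_NORM = 35.0   # bounding-box normalization scale

def decode_bboxes(bbox_out, cov_out, conf_thresh=0.4):
    """Decode raw DetectNet_v2 outputs into (x1, y1, x2, y2, score) boxes.

    bbox_out: (4, H, W) normalized box tensor for one class
    cov_out:  (H, W) coverage (confidence) map for the same class
    Returned coordinates are in input-image pixels.
    """
    _, grid_h, grid_w = bbox_out.shape
    # Grid-cell centers in pixels, divided by the normalization scale
    cx = (np.arange(grid_w) * STRIDE + OFFSET) / BBOX_NORM
    cy = (np.arange(grid_h) * STRIDE + OFFSET) / BBOX_NORM

    boxes = []
    ys, xs = np.where(cov_out > conf_thresh)
    for y, x in zip(ys, xs):
        # Network predicts offsets relative to the cell center,
        # scaled by BBOX_NORM; undo that to get pixel corners.
        x1 = (bbox_out[0, y, x] - cx[x]) * -BBOX_NORM
        y1 = (bbox_out[1, y, x] - cy[y]) * -BBOX_NORM
        x2 = (bbox_out[2, y, x] + cx[x]) * BBOX_NORM
        y2 = (bbox_out[3, y, x] + cy[y]) * BBOX_NORM
        boxes.append((x1, y1, x2, y2, float(cov_out[y, x])))
    return boxes
```

With an all-zero bbox tensor, this decodes each confident cell to a degenerate box at that cell's center, which is how I have been sanity-checking the grid arithmetic. Clustering (e.g. DBSCAN or NMS) over the per-cell boxes would still be needed afterwards.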