Object detection pre-trained model inference issue in DeepStream

Hi there,

I am trying to run a pre-trained InsightFace SCRFD face detection model with DeepStream SDK 7.0 on a remote Jetson AGX Orin board.

But I am facing an issue with the parsing of bounding boxes.

I have the simplified ONNX version of the model. It takes input of shape (1,3,640,640) and returns confidence scores in three bindings/tensors of shapes (12800,1), (3200,1), and (800,1); bounding boxes in three bindings/tensors of shapes (12800,4), (3200,4), and (800,4); and 5 key-points in three bindings/tensors of shapes (12800,10), (3200,10), and (800,10).

Apart from the ONNX file, I also converted the ONNX model to a TensorRT engine file and tried using that as well, but I am facing exactly the same issue.

Here are the config files that I am using:
myconfig.txt (3.8 KB)
dstest1_pgie_config.txt (846 Bytes)

I also tried to use the Python bindings by running the following file, but could not get it to work:
copy_test1.txt (6.6 KB)

Can you describe your issue in more detail?

I hope you are able to see this; in case you cannot, I am also sharing the same content as text:

dstest.txt.txt (4.9 KB)

You need to set the correct layer names in your config file and implement your own postprocess function to parse the output layers.

INFO: [Implicit Engine Info]: layers num: 10
0   INPUT  kFLOAT input.1         3x640x640       
1   OUTPUT kFLOAT 443             1               
2   OUTPUT kFLOAT 446             4               
3   OUTPUT kFLOAT 449             10              
4   OUTPUT kFLOAT 468             1               
5   OUTPUT kFLOAT 471             4               
6   OUTPUT kFLOAT 474             10              
7   OUTPUT kFLOAT 493             1               
8   OUTPUT kFLOAT 496             4               
9   OUTPUT kFLOAT 499             10   

The bounding boxes come from the layers with tensor sizes (12800,4), (3200,4), and (800,4). What exactly should I write in the config file? Please guide me with it.

You can just set the layer names in the config file. Then you need to implement your own postprocess function. You can refer to our demo, deepstream_yolo.

But can I use any pre-built postprocessing function?
I ask because, before this, when I tested the sample config file source30_1080p_dec_infer-resnet_tiled_display_int8.txt, it worked without my writing any postprocessing.

I also wanted to ask: is it possible to make this process a bit more streamlined and easier by using the Python bindings?

By the way, here is the model that I am using.

If you are using your own model, there is no pre-built postprocessing function for it.

No. This process does not require Python, so there is no need to use the Python bindings. If you are interested, you can take a look at our open source code for this part:

sources\libs\nvdsinfer\nvdsinfer_context_impl_output_parsing.cpp
NvDsInferStatus
DetectPostprocessor::fillDetectionOutput(
    const std::vector<NvDsInferLayerInfo>& outputLayers,
    NvDsInferDetectionOutput& output)

Actually, I have shared the model link above, so if you can share any resources about it with respect to DeepStream, that would be very helpful.

Okay, I will definitely go through this. Can you also please tell me whether the config files I am using are correct? I tried all the permutations and combinations of the output layer names that I have (i.e. 446, 471, etc.), but the error still says that no bindings were found for the given output names.

Whether you are using your own model or a third-party model, you need to customize your postprocessing based on the output of the model.
Please refer to our Gst-nvinfer Property Group Supported Keys. The list of layer names is delimited by semicolons.

Okay, but I just wanted to know whether the rest of the config file is correct or not.

As noted above, you have to add semicolons to delimit the layers, like 443;446;449...
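For example, assuming the engine bindings from the log earlier in this thread, the entry in the [property] group of the nvinfer config would look like this (a sketch; verify the names against your own engine log):

```
# All nine output bindings reported by the engine, semicolon-delimited
output-blob-names=443;446;449;468;471;474;493;496;499
```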

After your suggestions and after reading the documents, I have written this custom function:
custom box parsing.txt (8.0 KB)
It is based on the following network graph, which I obtained from the Netron app.

  • Now that you know the model and its output format, can you guide me on whether the code will work, and on how to implement it? Do I have to make other changes or precompile the code?
  • I also wanted to ask one more question: I copied a file into the sources folder of opt/nvidia/deepstream/deepstream and observed that it was reflected in opt/nvidia/deepstream/deepstream-7.0, but vice versa is not true, so in which directory should I make all the changes?

Using our demo project as an example, you have to put your code in the nvdsinfer_custom_impl_Yolo dir and build it.
Then set the parse-bbox-func-name and custom-lib-path fields to your function name and the generated library.
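For illustration, the relevant keys in the [property] group might look like this (the function name and library path below are placeholders; use the symbol exported by your own library and the path where you built it):

```
# Placeholder names: substitute your own parser symbol and library path
parse-bbox-func-name=NvDsInferParseCustomSCRFD
custom-lib-path=./nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
```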

The deepstream dir is just a soft (symbolic) link to deepstream-7.0.

Thanks for the help. I have referred to the example you shared, and I am able to build the library. I have also modified the config file to call my custom parser and library, as given below:

deepstream-app config file:
myconfig.txt (4.6 KB)
primary config file:
dstest1_pgie_config_2.txt (2.9 KB)
custom parser file:
nvdsparsebbox_scrfd.txt (4.8 KB)

As a first step, I just wanted to print the confidence scores. However, after executing the code, the values I am getting are very low, on the order of 0.014 or 0.015.

I verified the scores with my simplified ONNX model and found that they are stored in a NumPy array, which is referred to as a 1D tensor in DeepStream as well. When I checked the same layer using my custom parser and inferDims.numElements, I got the number of elements as 1, so I want to know how to access the whole array.

Below is the output of my app with the current configs:
forum_out.txt (7.6 KB)

You can refer to the NvDsInferLayerInfo definition in sources\includes\nvdsinfer.h:

typedef struct
{
  /** Holds the data type of the layer. */
  NvDsInferDataType dataType;
  /** Holds the dimensions of the layer. */
  union {
      NvDsInferDims inferDims;
      NvDsInferDims dims _DS_DEPRECATED_("dims is deprecated. Use inferDims instead");
  };
  /** Holds the TensorRT binding index of the layer. */
  int bindingIndex;
  /** Holds the name of the layer. */
  const char* layerName;
  /** Holds a pointer to the buffer for the layer data. */
  void *buffer;
  /** Holds a Boolean; true if the layer is an input layer,
   or false if an output layer. */
  int isInput;
} NvDsInferLayerInfo;

The void *buffer member is the address of the whole array. You need to parse the data from this buffer yourself, based on the type of data your model outputs.

I am using the buffer variable only to print the score/confidence. My model's output layer for confidence has a 12800×1 tensor, so I checked the buffer for this layer, and the values are very low. My ONNX Runtime model works properly, but my engine file is not detecting any faces. Let me know how to resolve this.