How to do instance segmentation and feed the segmentation mask into a secondary detector for recognition?

Hello, I am using NVIDIA DeepStream for instance segmentation. I want to feed the mask from the instance segmentation into a secondary detector so that it detects objects only inside the mask. I currently have the primary detector and the secondary detector configured, but the secondary detector does not detect objects within the mask; it detects objects in the bounding box of the entire segmented object.

My pipeline is:
rtmpsrc → flvdemux → h264parse → nvv4l2decoder → nvstreammux → nvinfer (PGIE) → queue → nvinfer (SGIE) → nvvideoconvert → nvdsosd → nvv4l2h264enc → h264parse → flvmux → rtmpsink
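
For reference, a gst-launch-1.0 sketch of that pipeline (the RTMP URLs, resolution, and config-file paths are placeholders, not from my setup; note the output side needs flvmux rather than a second flvdemux):

    gst-launch-1.0 rtmpsrc location=rtmp://<in-url> ! flvdemux ! h264parse ! \
        nvv4l2decoder ! m.sink_0 nvstreammux name=m batch-size=1 width=1920 height=1080 ! \
        nvinfer config-file-path=pgie_config.txt ! queue ! \
        nvinfer config-file-path=sgie_config.txt ! \
        nvvideoconvert ! 'video/x-raw(memory:NVMM),format=RGBA' ! nvdsosd display-mask=1 ! \
        nvvideoconvert ! nvv4l2h264enc ! h264parse ! flvmux streamable=true ! \
        rtmpsink location=rtmp://<out-url>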

nx@xavier:~$ sudo deepstream-app -v
deepstream-app version 6.3.0
DeepStreamSDK 6.3.0

Please provide complete information as applicable to your setup.


• Hardware Platform (Jetson / GPU): NVIDIA GeForce RTX 3090
• DeepStream Version: 6.3
• JetPack Version (valid for Jetson only): N/A
• TensorRT Version: 12.2
• NVIDIA GPU Driver Version (valid for GPU only): 535.104.05

  1. About instance segmentation applications, please refer to the peopleSegNet sample. About using a detection model as a secondary detector, please refer to the back-to-back-detectors sample (a minimal SGIE config sketch follows this list).
  2. What are the detailed inputs and outputs of the two models? If the first model can output bboxes, you can ignore the mask, because the second model only needs the detection box of the entire segmented object.
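
For reference, a minimal SGIE config sketch in the style of the back-to-back-detectors sample (the file names, model file, and gie-unique-id values here are assumptions, not taken from your setup):

    # sgie_config.txt (sketch only; adapt paths and ids to your models)
    [property]
    gpu-id=0
    onnx-file=secondary_detector.onnx
    batch-size=16
    # network-type=0 -> detector
    network-type=0
    # process-mode=2 -> operate on objects from an upstream GIE, not full frames
    process-mode=2
    # only run on objects produced by the PGIE whose gie-unique-id is 1
    operate-on-gie-id=1
    gie-unique-id=2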

I have completed the link between the first model and the second model. But now I want to feed the mask area from the first model into my second detection model, instead of feeding it the detection box.

  1. About “but detects objects in the detection box of the entire segmented object”: is the detection box not a rectangle?
  2. nvinfer only supports rectangular regions; polygons or irregular areas like masks are not supported. You need to use the rectangle coordinates that enclose the mask (a sketch of how to compute that rectangle follows this list).
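
As a workaround sketch (untested; it assumes DeepStream 6.x object metadata with mask_params populated by an instance-segmentation PGIE, where the mask patch covers the object's rect_params region, MaskRCNN-style), one could shrink rect_params to the tight rectangle enclosing the mask, so the SGIE at least crops no wider than the mask:

    /* Sketch only: shrink an object's rect_params to the tight bounding box of
     * its instance mask. Assumes mask_params.data is a width x height float
     * map that covers the object's rect_params region. */
    #include "nvdsmeta.h"

    static void
    shrink_rect_to_mask (NvDsObjectMeta *obj_meta)
    {
      NvOSD_MaskParams *m = &obj_meta->mask_params;
      if (!m->data || m->width == 0 || m->height == 0)
        return;

      int min_x = m->width, min_y = m->height, max_x = -1, max_y = -1;
      for (int y = 0; y < (int) m->height; y++) {
        for (int x = 0; x < (int) m->width; x++) {
          if (m->data[y * m->width + x] > m->threshold) {
            if (x < min_x) min_x = x;
            if (x > max_x) max_x = x;
            if (y < min_y) min_y = y;
            if (y > max_y) max_y = y;
          }
        }
      }
      if (max_x < 0) /* empty mask: leave the rect alone */
        return;

      /* Map mask-grid coordinates back into frame coordinates. */
      float sx = obj_meta->rect_params.width / (float) m->width;
      float sy = obj_meta->rect_params.height / (float) m->height;
      obj_meta->rect_params.left += min_x * sx;
      obj_meta->rect_params.top  += min_y * sy;
      obj_meta->rect_params.width  = (max_x - min_x + 1) * sx;
      obj_meta->rect_params.height = (max_y - min_y + 1) * sy;
    }

This could be called from a pad probe placed between the two nvinfer elements; the SGIE still runs on a rectangle, just a tighter one, since it cannot be restricted to the irregular mask shape itself.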

OK, I had thought nvinfer supported polygonal regions. So how do I get the coordinates of the mask region on the screen?

Please refer to point 2 in my second comment. Taking peopleSegNet as an example: this model can output bboxes and masks. The SGIE detection model only needs the bboxes; nvdsosd will draw the mask.

INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 5
0 INPUT kFLOAT input 3x640x640
1 OUTPUT kFLOAT boxes 100x4
2 OUTPUT kFLOAT scores 100x1
3 OUTPUT kFLOAT classes 100x1
4 OUTPUT kFLOAT masks 100x160x160

nvdsosd will draw the mask, but how do I get the coordinates of the mask?

" boxes 100x4" should be the bboxes. please refer to this configuration and code for how to get.

I think I found it.

/opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-mrcnn-app/deepstream_mrcnn_test.cpp

function => generate_mask_polygon

Yes, generate_mask_polygon can get the polygon coordinates of the mask. In deepstream_mrcnn_test.cpp, these coordinates are used to send messages to Kafka.
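
In the same spirit as generate_mask_polygon, here is a simplified row-scan sketch (assuming the NvOSD_MaskParams layout discussed above; a real implementation, like the sample's, would also need to handle holes and multiple blobs):

    /* Sketch: approximate the mask outline as a polygon by recording, for each
     * mask row, the leftmost and rightmost pixels above threshold. Left edges
     * are collected top-to-bottom, right edges appended bottom-to-top, which
     * closes the ring. Coordinates are in mask-grid space. */
    #include <glib.h>
    #include "nvdsmeta.h"

    static GArray *          /* flat array of floats: x0, y0, x1, y1, ... */
    mask_to_polygon (const NvOSD_MaskParams *m)
    {
      GArray *left  = g_array_new (FALSE, FALSE, sizeof (float));
      GArray *right = g_array_new (FALSE, FALSE, sizeof (float));

      for (unsigned int y = 0; y < m->height; y++) {
        int lx = -1, rx = -1;
        for (unsigned int x = 0; x < m->width; x++) {
          if (m->data[y * m->width + x] > m->threshold) {
            if (lx < 0) lx = x;
            rx = x;
          }
        }
        if (lx >= 0) {
          float p[2] = { (float) lx, (float) y };
          g_array_append_vals (left, p, 2);
          p[0] = (float) rx;
          g_array_append_vals (right, p, 2);
        }
      }
      /* Append the right edge in reverse order to close the ring. */
      for (int i = (int) right->len - 2; i >= 0; i -= 2) {
        float p[2] = { g_array_index (right, float, i),
                       g_array_index (right, float, i + 1) };
        g_array_append_vals (left, p, 2);
      }
      g_array_free (right, TRUE);
      return left;   /* caller frees with g_array_free (poly, TRUE) */
    }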

Thank you so much
