Bounding boxes visible but segmentation masks not displayed

I am running a YOLO segmentation model (yolo11n-seg, ONNX) with DeepStream 7.1 on a Jetson Orin Nano.
I can see bounding boxes from the parser output, but segmentation masks are not rendered on the display, even though nvdsosd is configured with display-mask=1.

Platform Details:

  • Hardware Platform: Jetson Orin Nano

  • JetPack Version: 6.2

  • DeepStream Version: 7.1

  • TensorRT Version: 10.3.0.30

  • NVIDIA GPU Driver Version: 540.4.0


Pipeline (excerpt)

nvstreammux name=mux width=800 height=600 batch-size=1 ! \
nvinferserver unique-id=1 config-file-path=./nodes/yolo11n-seg.onnx/config.txt ! \
nvdsosd display-text=1 display-bbox=1 display-mask=1 ! ...

nvinferserver Config (excerpt)

postprocess {
  labelfile_path: "./postprocessing/labels.txt"
  detection {
    num_detected_classes: 80
    custom_parse_bbox_func: "NvDsInferParseCustom"
    nms { confidence_threshold: 0.3 iou_threshold: 0.4 }
  }
}


Custom C++ Parser Snippet

// maskResized is a CV_32F cv::Mat holding the instance mask resized to the target size
NvDsInferInstanceMaskInfo maskInfo;
maskInfo.mask_width  = mask_resized_w;
maskInfo.mask_height = mask_resized_h;
maskInfo.mask_size   = mask_resized_w * mask_resized_h * sizeof(float);
maskInfo.mask        = new float[mask_resized_w * mask_resized_h]; // released downstream

// Copy the resized mask into maskInfo
for (int y = 0; y < mask_resized_h; y++) {
    for (int x = 0; x < mask_resized_w; x++) {
        float val = maskResized.at<float>(y, x);
        // tried both soft float [0,1] and thresholded binary {0,1}
        maskInfo.mask[y * mask_resized_w + x] = (val > 0.5f ? 1.0f : 0.0f);
    }
}
// bbox fields (left, top, width, height, classId, detectionConfidence) are filled elsewhere
objectList.push_back(maskInfo);
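
For reference, nvinfer expects an instance-mask parser matching NvDsInferInstanceMaskParseCustomFunc from nvdsinfer_custom_impl.h (the NvDsInferInstanceMaskInfo fields filled above are declared in nvdsinfer.h):

typedef bool (* NvDsInferInstanceMaskParseCustomFunc) (
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferInstanceMaskInfo> &objectList);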


Issue

  • Bounding boxes appear as expected.

  • Segmentation masks are not displayed (even with nvdsosd display-mask=1).

  • Tried both soft masks (float [0,1]) and binary masks (0/1).

  • No error logs from nvinferserver or nvdsosd (a probe to verify whether mask metadata actually reaches nvdsosd is sketched below).
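
To check where the masks get lost, a sink-pad probe on nvdsosd can report whether each object actually carries mask data; nvdsosd only draws a mask when obj_meta->mask_params.data is non-NULL. A minimal C++ sketch, assuming "osd" is the nvdsosd element from the pipeline above:

#include <gst/gst.h>
#include "gstnvdsmeta.h"

static GstPadProbeReturn
mask_probe (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
{
    GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
    if (!batch_meta)
        return GST_PAD_PROBE_OK;

    for (NvDsMetaList *lf = batch_meta->frame_meta_list; lf; lf = lf->next) {
        NvDsFrameMeta *frame = (NvDsFrameMeta *) lf->data;
        for (NvDsMetaList *lo = frame->obj_meta_list; lo; lo = lo->next) {
            NvDsObjectMeta *obj = (NvDsObjectMeta *) lo->data;
            // nvdsosd renders a mask only if mask_params.data is set
            g_print ("class=%d mask=%s (%ux%u)\n", obj->class_id,
                obj->mask_params.data ? "yes" : "NO",
                obj->mask_params.width, obj->mask_params.height);
        }
    }
    return GST_PAD_PROBE_OK;
}

// Attach with:
//   GstPad *pad = gst_element_get_static_pad (osd, "sink");
//   gst_pad_add_probe (pad, GST_PAD_PROBE_TYPE_BUFFER, mask_probe, NULL, NULL);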


Could you please clarify:

  1. Does nvdsosd / nvinferserver in DeepStream 7.1 require additional config to render segmentation masks along with bboxes?

  2. Should the masks provided in NvDsInferInstanceMaskInfo.mask be strictly binary, or are probability maps (floating-point values in [0,1]) acceptable, given that the field is declared as a float pointer?

  3. Is there a reference segmentation parser for nvinferserver along with config.txt that we can align with?

Please refer to our Mask2Former model in the deepstream-tao-app sample.

NvDsInferParseCustomMask2Former is the postprocess method for this model to parse the mask data.

I have implemented Mask2Former exactly as in the reference parser, using the official model downloaded from your repo; my parser is unchanged from your sample.


Pipeline:

nvinferserver unique-id=1 config-file-path=config.txt ! \
nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so \
         ll-config-file=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_tracker_NvDCF_perf.yml \
         display-tracking-id=1 tracking-surface-type=0 tracking-id-reset-mode=0 ! \
nvdsosd display-text=1 display-bbox=1 display-mask=1

Model config (config.pbtxt):

name: "mask2former.plan"
platform: "tensorrt_plan"
max_batch_size: 1

input [
  {
    name: "inputs"
    dims: [3, 800, 800]
    data_type: TYPE_FP32
  }
]

output [
  {
    name: "pred_masks"
    dims: [100, 800, 800]
    data_type: TYPE_FP32
  },
  {
    name: "pred_scores"
    dims: [100]
    data_type: TYPE_FP32
  },
  {
    name: "pred_classes"
    dims: [100]
    data_type: TYPE_INT64
  }
]
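
For context, this is the shape of the parsing logic: a condensed sketch in the spirit of the reference NvDsInferParseCustomMask2Former, not the verbatim code. The layer names follow the config above; the 0.4 score threshold and the bbox-from-mask-extents step are assumptions for illustration.

#include <algorithm>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>
#include "nvdsinfer_custom_impl.h"

extern "C" bool NvDsInferParseInstanceSegSketch (
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    NvDsInferParseDetectionParams const &detectionParams,
    std::vector<NvDsInferInstanceMaskInfo> &objectList)
{
    const NvDsInferLayerInfo *masks = nullptr, *scores = nullptr, *classes = nullptr;
    for (auto const &l : outputLayersInfo) {
        std::string n = l.layerName;
        if (n == "pred_masks") masks = &l;
        else if (n == "pred_scores") scores = &l;
        else if (n == "pred_classes") classes = &l;
    }
    if (!masks || !scores || !classes) return false;

    const unsigned int numQ = masks->inferDims.d[0];  // 100 queries
    const unsigned int mh   = masks->inferDims.d[1];  // 800
    const unsigned int mw   = masks->inferDims.d[2];  // 800
    const float *maskData   = (const float *) masks->buffer;
    const float *scoreData  = (const float *) scores->buffer;
    const int64_t *clsData  = (const int64_t *) classes->buffer;

    for (unsigned int q = 0; q < numQ; q++) {
        if (scoreData[q] < 0.4f) continue;            // assumed threshold

        NvDsInferInstanceMaskInfo obj = {};
        obj.classId = (unsigned int) clsData[q];
        obj.detectionConfidence = scoreData[q];
        obj.mask_width = mw;
        obj.mask_height = mh;
        obj.mask_size = mw * mh * sizeof (float);     // size in bytes
        obj.mask = new float[mw * mh];                // released downstream
        std::memcpy (obj.mask, maskData + (size_t) q * mw * mh, obj.mask_size);

        // Derive the bbox from the mask extents.
        unsigned int x0 = mw, y0 = mh, x1 = 0, y1 = 0;
        for (unsigned int y = 0; y < mh; y++)
            for (unsigned int x = 0; x < mw; x++)
                if (obj.mask[y * mw + x] > 0.5f) {
                    x0 = std::min (x0, x); y0 = std::min (y0, y);
                    x1 = std::max (x1, x); y1 = std::max (y1, y);
                }
        if (x1 <= x0 || y1 <= y0) { delete[] obj.mask; continue; }
        obj.left = x0; obj.top = y0;
        obj.width = x1 - x0; obj.height = y1 - y0;
        objectList.push_back (obj);
    }
    return true;
}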


Issue:
I can see bounding boxes as expected, but the segmentation masks are still not displayed, even though:

  • nvdsosd has display-mask=1.

  • The parser fills NvDsInferInstanceMaskInfo exactly like the reference.

  • No errors are reported by nvinferserver or nvdsosd.


Question:
Am I missing something in the pipeline or configuration to ensure that the masks are rendered?

Could you attach the command you use to run our Mask2Former? In theory you don’t need to modify any code; just run the sample following the instructions in our README.

export SHOW_MASK=1
./apps/tao_detection/ds-tao-detection configs/app/ins_seg_app.yml

I’m able to successfully get both bounding boxes and segmentation masks when using an nvinfer pipeline. For example:

export SHOW_MASK=1
gst-launch-1.0 filesrc location=sample_720p.mp4 ! \
  qtdemux ! h264parse ! decodebin ! nvvidconv ! "video/x-raw(memory:NVMM),format=NV12" ! mux.sink_0 \
  nvstreammux name=mux width=800 height=600 batch-size=1 ! \
  nvinfer unique-id=1 config-file-path=mask2former.plan/config.txt ! \
  nvdsosd display-text=1 display-bbox=1 display-mask=1 ! \
  nvstreamdemux name=demux demux.src_0 ! queue ! nvvidconv ! \
  fpsdisplaysink video-sink="nveglglessink window-height=600 window-width=800"

However, when I switch to an nvinferserver pipeline, I only see bounding boxes (no masks):

export SHOW_MASK=1
gst-launch-1.0 filesrc location=sample_720p.mp4 ! \
  qtdemux ! h264parse ! decodebin ! nvvidconv ! "video/x-raw(memory:NVMM),format=NV12" ! mux.sink_0 \
  nvstreammux name=mux width=800 height=600 batch-size=1 ! \
  nvinferserver unique-id=1 config-file-path=mask2former.plan/config.txt ! \
  nvdsosd display-text=1 display-bbox=1 display-mask=1 ! \
  nvstreamdemux name=demux demux.src_0 ! queue ! nvvidconv ! \
  fpsdisplaysink video-sink="nveglglessink window-height=600 window-width=800"

The only difference between the two runs is the inference plugin and the configuration file passed to it.

I also tried running with the provided app:

export SHOW_MASK=1
./apps/tao_detection/ds-tao-detection configs/app/ins_seg_app.yml

With nvinfer inside that app config, I get proper masks. But when I change it to nvinferserver, only bounding boxes are shown.

Question:
Am I missing something in the nvinferserver config to enable mask visualization? The custom parser appears to work with nvinfer but not with nvinferserver.

Sorry, the current nvinferserver does not support the instance segmentation mask function. We suggest you follow the steps below to implement this feature yourself.

  1. By running Mask2Former with nvinfer, you can become familiar with the nvinfer-related processing code.

  2. Implement your yolo-seg model with nvinfer first.

  3. Since both nvinfer and nvinferserver are open source, you can implement it in nvinferserver based on the corresponding process in nvinfer:

deepstream/sources/gst-plugins/gst-nvinfer
deepstream/sources/gst-plugins/gst-nvinferserver

Yes, my understanding is that the custom parser is the same in both cases, whether I use nvinfer or nvinferserver; the bboxes and masks are then drawn by nvdsosd.

Both nvinfer and nvinferserver mainly perform inference:

  • They call the backend (TensorRT for nvinfer, Triton for nvinferserver)

  • They get back raw tensors

  • Then the custom parser interprets those tensors into DeepStream objects.

In my case, in both plugins, my parser is appending results into

std::vector<NvDsInferInstanceMaskInfo> &objectList

with the same values (masks and bounding boxes).

So if the vector has the same mask data in both cases, what is the missing piece?

  • In nvinfer, the masks appear on screen.

  • In nvinferserver, the masks don’t show up.

Is the difference only in the way the mask metadata gets attached to NvDsObjectMeta or propagated downstream to nvdsosd?
Or does nvinferserver currently drop/ignore the mask metadata, even though the parser fills objectList correctly?


Basically: if the parser populates the same structure, why are masks drawn with nvinfer but not with nvinferserver?

It’s only in the way the mask metadata gets attached to NvDsObjectMeta: the post-processing in nvinferserver does not attach the NvDsInferInstanceMaskInfo to the object.
You can refer to our source code in deepstream/sources/libs/nvdsinferserver/infer_postprocess.cpp, which does not implement attaching NvDsInferInstanceMaskInfo. You can follow the nvinfer source code to implement this function:

deepstream/sources/libs/nvdsinfer/nvdsinfer_context_impl_output_parsing.cpp
NvDsInferStatus
InstanceSegmentPostprocessor::fillDetectionOutput(
    const std::vector<NvDsInferLayerInfo>& outputLayers,
    NvDsInferDetectionOutput& output)
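
The missing piece in nvinferserver is roughly the step below, which gst-nvinfer performs when attaching each parsed object to NvDsObjectMeta. This is a sketch using the NvDsObjectMeta / NvOSD_MaskParams fields from nvdsmeta.h, not the verbatim gst-nvinfer code; see attach_metadata_detector() in the gst-nvinfer sources for the real implementation.

#include "nvdsmeta.h"
#include "nvdsinfer.h"

static void
attach_instance_mask (NvDsObjectMeta *obj_meta,
    NvDsInferInstanceMaskInfo const *obj, float seg_threshold)
{
    // Hand the parser-allocated mask buffer over to the object meta;
    // nvdsosd reads mask_params when display-mask=1.
    obj_meta->mask_params.data = obj->mask;
    obj_meta->mask_params.size = obj->mask_size;
    obj_meta->mask_params.width = obj->mask_width;
    obj_meta->mask_params.height = obj->mask_height;
    // Pixels with value above the threshold are rendered.
    obj_meta->mask_params.threshold = seg_threshold;
}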

Thank you for the response. One option is to customize nvinferserver for instance segmentation, but another possible approach is to use the nvdspostprocess plugin.

I’d like to clarify:

  • Can we implement only instance segmentation within a custom nvdspostprocess library, without having to define additional parsers such as classification or detection?
  • Does nvdspostprocess support truly generic post-processing, where the buffer itself can be modified by a custom algorithm and a new buffer injected back into the pipeline?

If not, what would be the best method to achieve this?

Yes, it can support truly generic post-processing. We don’t have many samples at present; for now, you can only refer to the source code in deepstream/sources/gst-plugins/gst-nvdspostprocess. We will provide an example in the next version.

I am trying to use the nvdspostprocess plugin for classification. Unlike nvinfer or nvinferserver, I don’t see an explicit config option to provide a path to a custom parser library.

  • In nvinfer/nvinferserver, we can pass a separate .so implementing a custom parser.
  • In nvdspostprocess, the only thing I see is that we pass:
nvdspostprocess postprocesslib-config-file=config_classifier_vehicle_type.yml \
                postprocesslib-name=./postprocesslib_impl/libpostprocess_impl.so

Inside the implementation, I only see the default parser defined in:
sources/gst-plugins/gst-nvdspostprocess/postprocesslib_impl/post_processor_classify.cpp

Example:

extern "C"
bool NvDsPostProcessClassiferParseCustomSoftmax(
    std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
    NvDsInferNetworkInfo const &networkInfo,
    float classifierThreshold,
    std::vector<NvDsPostProcessAttribute> &attrList,
    std::string &descString);

But it’s not clear where such functions are expected to be loaded from.

My questions:

  1. Am I expected to directly modify post_processor_classify.cpp and define my function there?
  2. Is there a way to define this function in a separate file / .so and load it without rebuilding the whole postprocesslib_impl?
  3. For complex pipelines, I would prefer defining custom parser libraries per inference instance, rather than combining everything into one big shared library. Is this supported, and how should it be configured?

You can implement your own algorithm in a separate file / .so; implement the following interface in your own library:

deepstream/sources/gst-plugins/gst-nvdspostprocess/gstnvdspostprocess.cpp
...
      nvdspostprocess->algo_ctx =
        nvdspostprocess->algo_factory->CreateCustomAlgoCtx(nvdspostprocess->postprocess_lib_name,
         (DSPostProcess_CreateParams*) &params);
...
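
A minimal skeleton of such a library might look like the following. The type names (IDSPostProcessLibrary, DSPostProcess_CreateParams) and the exported CreateCustomAlgoCtx symbol come from the gst-nvdspostprocess sources, but please verify the exact interface in nvdspostprocesslib_interface.hpp for your DeepStream version.

// Skeleton of a standalone postprocess library (.so) loaded via
// postprocesslib-name; verify the pure-virtual methods of
// IDSPostProcessLibrary in nvdspostprocesslib_interface.hpp.
#include "nvdspostprocesslib_interface.hpp"

class MyCustomAlgo : public IDSPostProcessLibrary
{
public:
    explicit MyCustomAlgo (DSPostProcess_CreateParams *params) { /* ... */ }
    // Override the pure-virtual methods declared in the interface here,
    // e.g. the per-buffer processing hook where you can modify the buffer
    // or attach your own metadata before pushing it downstream.
};

// Exported factory symbol resolved by the plugin (see the excerpt above).
extern "C" IDSPostProcessLibrary *
CreateCustomAlgoCtx (DSPostProcess_CreateParams *params)
{
    return new MyCustomAlgo (params);
}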

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
