No detections on segmentation model with nvdspreprocess + nvinferserver

• Hardware Platform (Jetson / GPU) - dGPU
• DeepStream Version - 7.0
• Issue Type( questions, new requirements, bugs) - question

I have the following working pipeline that currently ingests a 1080p video stream and successfully detects, then displays bounding boxes w/ a TAO Toolkit-trained MaskRCNN model (to clarify, I do not want to show segmentations, but only bounding boxes):

rtmpsrc → queue → flvdemux → h264parse → nvv4l2decoder → nvstreammux → nvinferserver → nvdspostprocess → nvdosd → nvv4l2h264enc → h264parse → flvmux → rtmpsink

When I add the nvdspreprocess element to the pipeline, no detections are made, and thus no bounding boxes are displayed:

rtmpsrc → queue → flvdemux → h264parse → nvv4l2decoder → nvstreammux → nvdspreprocess → nvinferserver → nvdspostprocess → nvdosd → nvv4l2h264enc → h264parse → flvmux → rtmpsink

Referring to these two related posts,

I noted that I’m using nvinferserver, not nvinfer. I then adapted the configurations to accommodate using nvdspreprocess w/ nvinferserver as follows:

[nvinferserver]

[nvdspreprocess]

  • Set its ROI to cover the entire video stream resolution (1920x1080) before adding it to the pipeline.
  • Inherit nvinferserver’s “preprocess” settings accordingly.
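For context, the two points above might look like this in the nvdspreprocess config file. This is a sketch only: the network-input-shape assumes PeopleSegNet’s 3x576x960 input, the custom-lib path assumes a stock DeepStream install, and the normalization values are carried over from nvinferserver’s preprocess block shown later in this thread.

```
[property]
enable=1
target-unique-ids=1
process-on-frame=1
# assumed PeopleSegNet-style input: batch;channels;height;width
network-input-shape=1;3;576;960
network-input-order=0
network-color-format=0
tensor-data-type=0
tensor-name=Input
maintain-aspect-ratio=0
# assumed default install path for the stock 2D preprocess library
custom-lib-path=/opt/nvidia/deepstream/deepstream/lib/gst-plugins/libcustom2d_preprocess.so
custom-tensor-preparation-function=CustomTensorPreparation

[user-configs]
pixel-normalization-factor=0.017507
offsets=123.675;116.28;103.53

[group-0]
src-ids=0
process-on-roi=0
```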

Given that I’ve already added parameters related to my model in the nvdspreprocess settings based on the preprocess settings previously used in nvinferserver, I’m not sure why no detections are being made at all.

Attached are the config files used:

It takes some time to reproduce the issue; I will get back to you after trying it.

Please use “process-on-roi=0” in the 2_config_preprocess.txt
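That is, in the [group-0] section of the preprocess config (sketch; the group and src indices depend on your setup):

```
[group-0]
src-ids=0
# 0 = process the full frame; 1 = process only the configured ROIs
process-on-roi=0
```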

Hi Fiona, I’m afraid I still have no detections even after making the switch to “process-on-roi=0”.

Additionally, as a sanity check, I also tried another experiment: using nvdspreprocess with the “deepstream-segmask” example on deepstream_python_apps/apps/deepstream-segmask at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub

This is what I uncovered:

  • deepstream-segmask (nvinfer) + nvdspreprocess successfully detects, then displays bounding boxes
  • deepstream-segmask (nvinferserver) + nvdspreprocess fails to detect, thus no bounding boxes displayed, even after switching to “process-on-roi=0”

Were you able to successfully get detections, and thus bounding boxes to display after using nvdspreprocess w/ nvinferserver and “process-on-roi=0”?

Does the nvinferserver configuration without nvdspreprocess work?

Yes, it appears so.

Can you send us your model and the configurations?

Certainly.

I’ve zipped the following files for the “nvinferserver w/out nvdspreprocess” case (peopleSegNet, successful detection) as an attachment:

  • model.plan (this is the TRT-optimized version from here)
  • config_infer_peoplesegnet.txt
  • peopleSegNet_labels.txt
  • config_mrcnn.yml

1_nvinferserver_wout_nvdspreprocess.zip (38.5 MB)

Just checking in - do let me know if there are any issues with the files that were provided.

Please provide the ONNX model instead of the TRT plan file.

Here you go: files.zip (70.6 MB)
Decryption key for working w/ the .etlt file is: tlt_encode

I do not have an ONNX file; only an .etlt file is provided from here, which I then converted into a model.plan file for use w/ nvinferserver.

What are the input layer’s dimensions and name? Are they the same as PeopleSegNet | NVIDIA NGC?

Yes, they are the same.

What is your GPU?

RTX 3090

Do you mean you could not get the segmentation masks drawn on the video?

I’m afraid to say that it’s actually about bounding boxes, not segmentation masks.

I want to display bounding box results for a segmentation model, like PeopleSegNet. And in these 3 scenarios, I was able to do so:

Successfully displays bounding boxes (tested with custom Mask-RCNN and PeopleSegNet models)
(a) ⋯ nvstreammux → nvinfer → nvdspostprocess ⋯
(b) ⋯ nvstreammux → nvdspreprocess → nvinfer → nvdspostprocess ⋯
(c) ⋯ nvstreammux → nvinferserver → nvdspostprocess ⋯

However, the moment I try to use nvdspreprocess with nvinferserver, I am not able to display any bounding box results, like in these 2 scenarios:

Fails to display bounding boxes (tested with custom Mask-RCNN and PeopleSegNet models)
(d) ⋯ nvstreammux → nvdspreprocess(process-on-roi=1) → nvinferserver → nvdspostprocess ⋯
(e) ⋯ nvstreammux → nvdspreprocess(process-on-roi=0) → nvinferserver → nvdspostprocess ⋯

From these scenarios, what I’ve observed is:

  • nvinfer successfully displays bounding boxes with or without nvdspreprocess
  • nvinferserver, unfortunately, can only display bounding boxes without nvdspreprocess
  • nvinferserver fails to display bounding boxes the moment nvdspreprocess element is used with it

Given that I need to display bounding boxes using nvinferserver with nvdspreprocess, like in scenarios (d) and (e), were you able to reproduce the same issue I faced?

By modifying the peopleSegNet sample here deepstream_reference_apps/deepstream_app_tao_configs at master · NVIDIA-AI-IOT/deepstream_reference_apps (github.com), I can see the bboxes with the attached configurations.
config_preprocess.txt (3.2 KB)
deepstream_app_source1_segmentation.txt (4.1 KB)


Just to confirm:

For your settings in deepstream_app_source1_segmentation.txt, under [primary-gie], I see a reference to “triton/config_infer_primary_peopleSegNet.txt”, which has the following content:

 ⋮ 
preprocess {
    network_format: IMAGE_FORMAT_RGB
    tensor_order: TENSOR_ORDER_LINEAR
    tensor_name: "Input"
    maintain_aspect_ratio: 0
    frame_scaling_hw: FRAME_SCALING_HW_DEFAULT
    frame_scaling_filter: 1
    normalize {
      scale_factor: 0.017507
      channel_offsets: [123.675,116.280,103.53]
    }
  }
⋮

Given that I want to use nvdspreprocess’ output with nvinferserver, and this config still contains a “preprocess” block, doesn’t this mean that the output from nvdspreprocess is not actually used?

As I understand (and correct me if I’m wrong) from Gst-nvinferserver — DeepStream documentation 6.4 documentation, if I want to use nvdspreprocess’ output w/ nvinferserver, I need to disable nvinferserver’s “preprocess” block and enable the equivalent tensor-from-meta setting instead, which means:

⋮
  # preprocess block removed:
  # preprocess {
  #     network_format: IMAGE_FORMAT_RGB
  #     tensor_order: TENSOR_ORDER_LINEAR
  #     tensor_name: "Input"
  #     maintain_aspect_ratio: 0
  #     frame_scaling_hw: FRAME_SCALING_HW_DEFAULT
  #     frame_scaling_filter: 1
  #     normalize {
  #       scale_factor: 0.017507
  #       channel_offsets: [123.675,116.280,103.53]
  #     }
  # }
  input_tensor_from_meta {
    is_first_dim_batch: true
  }
⋮

so that nvdspreprocess’ output can be used, instead of relying on nvinferserver’s own preprocess settings.

This is necessary for me because I intend to tile the input video stream into multiple smaller ROIs for inference.

From what I understand (and correct me if I’m wrong), nvinferserver’s “preprocess” config doesn’t support this, which means the only way to do this is for me to disable nvinferserver’s “preprocess” and leave the preprocessing to be done entirely through nvdspreprocess.
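For what it’s worth, the tiling I have in mind would be configured entirely in nvdspreprocess, along these lines. This is a sketch for a 1920x1080 frame split into four 960x540 tiles; all values are illustrative, and my understanding is that the first dimension of network-input-shape must cover the number of ROIs times the batch size.

```
[property]
# first dim = num ROIs per frame x batch size (assumed 4 tiles, batch 1)
network-input-shape=4;3;576;960

[group-0]
src-ids=0
process-on-roi=1
# four tile ROIs, each given as left;top;width;height
roi-params-src-0=0;0;960;540;960;0;960;540;0;540;960;540;960;540;960;540
```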

“input-tensor-meta=1” is also set in deepstream_app_source1_segmentation.txt. This changes the input of gst-nvinferserver. deepstream-app and gst-nvinferserver are both open source; you can read the code to understand how it works.
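For anyone else landing here, the relevant groups of deepstream_app_source1_segmentation.txt presumably look something like this (a sketch based on the posts above; plugin-type=1 selects nvinferserver in deepstream-app, and input-tensor-meta=1 makes it consume the tensors attached by nvdspreprocess — the file names are taken from this thread):

```
[pre-process]
enable=1
config-file=config_preprocess.txt

[primary-gie]
enable=1
# 1 = use nvinferserver (Triton) instead of nvinfer
plugin-type=1
# consume preprocessed tensor meta attached by nvdspreprocess
input-tensor-meta=1
config-file=triton/config_infer_primary_peopleSegNet.txt
```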