Can't draw inference results using the OSD binary for deepstream_pose_estimation when converting the format to RGBA before the demuxer

• Hardware Platform (Jetson / GPU): Jetson AGX Orin
• DeepStream Version: 6.1.1
• JetPack Version (valid for Jetson only): 5.0.2
• TensorRT Version: 8.4.1-1+cuda11.4
• Issue Type (questions, new requirements, bugs): questions

I use deepstream_pose_estimation to deploy a human pose estimation application on DeepStream 6.1.1.
I clone the repository, mount it into a deepstream-l4t container, and run it.

git clone https://github.com/NVIDIA-AI-IOT/deepstream_pose_estimation.git
cd deepstream_pose_estimation
sudo docker run -it --rm --runtime=nvidia --net=host -v ${PWD}:/tmp nvcr.io/nvidia/deepstream-l4t:6.1.1-iot

I replace the OSD binary for Jetson in /opt/nvidia/deepstream/deepstream/lib with the one provided in this repository under bin/.

cp /tmp/bin/Jetson/libnvds_osd.so /opt/nvidia/deepstream/deepstream/lib/libnvds_osd.so
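
As a side note for anyone reproducing this: it may be worth backing up the stock library before overwriting it and confirming afterward that the nvdsosd plugin still loads, since an incompatible library may keep the plugin from loading. A minimal sketch, assuming gst-inspect-1.0 is available in the container:

# keep a copy of the stock library so it can be restored later
cp /opt/nvidia/deepstream/deepstream/lib/libnvds_osd.so /opt/nvidia/deepstream/deepstream/lib/libnvds_osd.so.orig
# after the swap, verify the OSD plugin is still registered
gst-inspect-1.0 nvdsosd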

When I run the following command, it does not draw inference results.

gst-launch-1.0 -e filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! \
  qtdemux ! queue ! h264parse ! nvv4l2decoder ! \
  mux.sink_0 nvstreammux name=mux batch-size=1 width=1920 height=1080 ! \
  nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 \
    model-engine-file=/opt/nvidia/deepstream/deepstream/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine ! \
  queue ! nvvideoconvert ! "video/x-raw(memory:NVMM), format=(string)RGBA" ! nvstreamdemux name=demux demux.src_0 ! \
  queue ! nvvideoconvert ! nvdsosd process-mode=CPU_MODE ! queue ! nvvideoconvert ! "video/x-raw(memory:NVMM),format=(string)I420" ! \
  nvv4l2h264enc ! h264parse ! qtmux ! filesink sync=false location=out.mp4

However, the following command does draw inference results.
The only difference from the command above is an added nvvideoconvert that converts the format to NV12 after nvstreamdemux.

gst-launch-1.0 -e filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! \
  qtdemux ! queue ! h264parse ! nvv4l2decoder ! \
  mux.sink_0 nvstreammux name=mux batch-size=1 width=1920 height=1080 ! \
  nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt batch-size=1 \
    model-engine-file=/opt/nvidia/deepstream/deepstream/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine ! \
  queue ! nvvideoconvert ! "video/x-raw(memory:NVMM), format=(string)RGBA" ! nvstreamdemux name=demux demux.src_0 ! \
  queue ! nvvideoconvert ! "video/x-raw(memory:NVMM), format=NV12" ! \
  queue ! nvvideoconvert ! nvdsosd process-mode=CPU_MODE ! queue ! nvvideoconvert ! "video/x-raw(memory:NVMM),format=(string)I420" ! \
  nvv4l2h264enc ! h264parse ! qtmux ! filesink sync=false location=out.mp4

Why does this phenomenon occur?

As a side note, if I do not replace the OSD binary, both of the commands above draw inference results.

Hi, I simplified the pipeline a little bit, but I’m not seeing the issue you reported with the default nvdsosd binary.

gst-launch-1.0 \
uridecodebin uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! \
mux.sink_0 nvstreammux name=mux batch-size=1 width=1920 height=1080 ! \
nvvideoconvert ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! \
nvvideoconvert ! \
"video/x-raw(memory:NVMM), format=(string)RGBA" ! \
nvdsosd process-mode=CPU_MODE ! \
nvvideoconvert ! "video/x-raw(memory:NVMM),format=(string)I420" ! \
nvv4l2h264enc ! h264parse ! qtmux ! filesink sync=false location=out.mp4

Can you test with that pipeline and share the binaries you are using?

Thank you for confirming.

When I test with your pipeline and the default nvdsosd binary, it draws inference results.

But if I test with the following pipeline and the custom nvdsosd binary, it does not draw inference results.

The custom nvdsosd binary is here.

gst-launch-1.0 \
uridecodebin uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 ! \
mux.sink_0 nvstreammux name=mux batch-size=1 width=1920 height=1080 ! \
nvvideoconvert ! \
nvinfer config-file-path=/opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/config_infer_primary.txt ! \
nvvideoconvert ! \
"video/x-raw(memory:NVMM), format=(string)RGBA" ! \
nvstreamdemux name=demux demux.src_0 ! \
nvdsosd process-mode=CPU_MODE ! \
nvvideoconvert ! "video/x-raw(memory:NVMM),format=(string)I420" ! \
nvv4l2h264enc ! h264parse ! qtmux ! filesink sync=false location=out.mp4

With DeepStream version upgrades, old libraries may not be compatible. That lib was built for CUDA 10.2. If there is no problem with the new version, you don’t need to pay attention to this old library’s issue anymore.
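
For reference, one way to check which CUDA runtime a library was built against, assuming it links libcudart dynamically, is:

ldd /tmp/bin/Jetson/libnvds_osd.so | grep -i cuda

A CUDA 10.2 build would list libcudart.so.10.2 here, while JetPack 5.0.2 ships CUDA 11.4.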

Could you upgrade the Jetson OSD library for deepstream_pose_estimation, or explain what causes this phenomenon when I convert the format to RGBA before the demuxer?

There may be some bugs in the osd plugin or nvstreamdemux plugin. If there are no issues with the latest version, we will not provide further support for the previous version.
So if you don’t replace the osd library, will it run normally in the latest version?

Yes, but then I can’t apply the OSD library changes described in step 3 of this page to the output of the pipeline.

As you said, if you don’t replace the osd lib in the latest version, it works well.
This demo has not been maintained for a long time. Could you refer to the version we have been maintaining below?
https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps/tree/master/apps/tao_others/deepstream-bodypose2d-app
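
If you want to try it, the app can be fetched like this (a sketch; the path follows the repository layout in the link above):

git clone https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps.git
cd deepstream_tao_apps/apps/tao_others/deepstream-bodypose2d-app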

Thank you for the information.

I have two questions about deepstream-bodypose2d-app.

  1. I use trt_pose as the model for deepstream_pose_estimation.
    Can I use it with deepstream-bodypose2d-app?

  2. According to this page, the OSD library changes are meant to ignore extraneous values when drawing.
    Is that processing included in this app?

1. No. You should use the model provided in this demo. The postprocessing in the code is matched to the model.
2. There are basically no extraneous values with this new model.

Since there is no problem without replacing the osd library, why do you still want to replace the osd lib and try to locate the problem by replacing it?

I have already developed my application using trt_pose.
So, I want to fix this problem without major changes.

Could you explain what fixes were made to the osd library compared to the original?

With each version upgrade, there are too many patches and new features to track, so it is very difficult to identify the specific fixes. If the latest version works well, we won’t go back to locate this specific problem. Thanks.

Do you mean that there are changes other than adding logic to ignore feature points that fall outside the frame of the video buffer?

Yes. Each new version includes many patches that fix different issues and add new features. The demo you used targets DeepStream 5.0, which is a bit old, so there are many more changes in the plugin.