Porting gst-dsexample-cuda from DS 8.0 to DS 7.1

Description

I tested the gst-dsexample-cuda plugin in DeepStream 8.0, and it works as expected for blurring bounding boxes.

However, the version available in DeepStream 7.1 is not sufficient for my use case because it relies on OpenCV CPU processing.

My objective is to use the CUDA-based plugin from DeepStream 8.0 and migrate it to DeepStream 7.1, if possible.

Environment

TensorRT Version:
GPU Type: Jetson Orin NX
Nvidia Driver Version:
CUDA Version: 12.6
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): Container DS 7.1 triton multiarch

Steps To Reproduce

  1. Copy the plugin from DeepStream 8.0 to DeepStream 7.1:
/opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-dsexample-cuda
  1. Build OpenCV with CUDA support (no change):
cd ~
git clone https://github.com/opencv/opencv.git
git clone https://github.com/opencv/opencv_contrib.git

cd ~/opencv
mkdir build
cd build

sudo apt install -y cmake

cmake -D CMAKE_BUILD_TYPE=Release \
      -D CMAKE_INSTALL_PREFIX=/usr/local \
      -D OPENCV_EXTRA_MODULES_PATH="$HOME/opencv_contrib/modules" \
      -D WITH_CUDA=ON \
      -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda \
      -D WITH_CUDNN=ON \
      -D OPENCV_DNN_CUDA=ON \
      -D BUILD_EXAMPLES=ON ..

make -j12
sudo make install
sudo ldconfig
  1. Update the plugin Makefile:

Change:

NVDS_VERSION := 8.0 → 7.1

and:

CFLAGS += -fPIC -DDS_VERSION=\"8.0.0\" \

to:

CFLAGS += -fPIC -DDS_VERSION=\"7.1.0\" \
  1. Build and install:
make
make install

  1. Add the following section to the DeepStream config file:
[ds-example]
enable=1
processing-width=640
processing-height=480
full-frame=0
#batch-size for batch supported optimized plugin
batch-size=2
unique-id=15
gpu-id=0
blur-objects=1
nvbuf-memory-type=0

Issue

This leads to the following runtime error:

ERROR from dsexample0: gst_dsexample_transform_ip: need NVBUF_MEM_CUDA_DEVICE memory for OpenCV CUDA blurring
Debug info:
gstdsexample_cuda.cpp(750): gst_dsexample_transform_ip ():
/GstPipeline:pipeline/GstBin:dsexample_bin/GstDsExample:dsexample0

If anyone has experience porting gst-dsexample-cuda from DeepStream 8.0 to 7.1, I would appreciate guidance on whether this memory type issue is related to NVBUF memory handling differences between versions.

Hey @valou ,

Moving this to the DeepStream forum, as it doesn't seem to involve much TRT.

Sorry for the long delay. Is this still an issue?

cv::cuda::GpuMat only supports constructing from a GPU buffer.
On Jetson, the memory type is nvbuf-mem-surface-array.

If you use nvvideoconvert to convert the NvBufSurface memory type, you will hit this error:

gstnvvideoconvert.c:4255:gst_nvvideoconvert_transform: buffer transform failed
/dvs/git/dirty/git-master_linux/nvutils/nvbufsurftransform/nvbufsurftransform.cpp:4543: => Surface type not supported for transformation NVBUF_MEM_CUDA_DEVICE

This means gstdsexample_cuda cannot be supported on Jetson, because of a limitation in the low-level library.
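
To make the memory-type reasoning above concrete, here is a minimal C++ sketch of why `nvbuf-memory-type=0` ends up failing the check on Jetson. It is not the plugin's actual code: the enum is a local mirror of the `NvBufSurfaceMemType` values from `nvbufsurface.h` (so the snippet compiles without the SDK), and `resolve_mem_type` and `can_wrap_in_gpumat` are hypothetical helper names introduced only for illustration.

```cpp
// Local mirror of the NvBufSurfaceMemType values declared in nvbufsurface.h.
// Assumption: a real plugin would include that header instead of redefining them.
enum MemType {
    MEM_DEFAULT       = 0,  // platform default (what nvbuf-memory-type=0 selects)
    MEM_CUDA_PINNED   = 1,
    MEM_CUDA_DEVICE   = 2,  // what the CUDA blur path asks for
    MEM_CUDA_UNIFIED  = 3,
    MEM_SURFACE_ARRAY = 4   // what Jetson hands out by default
};

// Hypothetical helper: resolve the "default" type to the concrete type
// used on the platform (surface array on Jetson, CUDA device memory on dGPU).
inline MemType resolve_mem_type(MemType requested, bool is_jetson) {
    if (requested != MEM_DEFAULT) {
        return requested;
    }
    return is_jetson ? MEM_SURFACE_ARRAY : MEM_CUDA_DEVICE;
}

// Hypothetical guard, modeled on the error message in this thread:
// the buffer can only be wrapped in a cv::cuda::GpuMat if it is plain
// CUDA device memory.
inline bool can_wrap_in_gpumat(MemType requested, bool is_jetson) {
    return resolve_mem_type(requested, is_jetson) == MEM_CUDA_DEVICE;
}
```

With these definitions, `can_wrap_in_gpumat(MEM_DEFAULT, true)` is false (the Jetson case in this thread), while `can_wrap_in_gpumat(MEM_DEFAULT, false)` is true on a dGPU. Requesting `MEM_CUDA_DEVICE` explicitly would pass this guard, but as the nvvideoconvert log above shows, the transform into that type is rejected on Jetson.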

Thank you for the clarification.

If I understand correctly, the limitation does not come directly from DeepStream itself, but from the Jetson hardware architecture and the memory type being used (nvbuf-mem-surface-array instead of NVBUF_MEM_CUDA_DEVICE).

My goal is still to blur detected objects in an optimized way on the RTSP output pipeline on Jetson using DeepStream 7.1.

In your opinion, what would be the most promising and easiest approach to implement?

  • Using VPI for the blur operation?

  • Or adapting the gst-dsexample-cuda plugin to work directly with nvbuf-mem-surface-array instead of NVBUF_MEM_CUDA_DEVICE, if that is technically possible?

I am mainly looking for the most practical and maintainable solution on Jetson.

Is there perhaps another approach that you would recommend in this context?

Thank you.

In DS-8.0, the nvdsosd plugin added two properties, blur-bbox and blur-on-gie-class-ids, for blurring objects. This is implemented using CUDA; you can try porting this feature.

Refer to the nvdsosd low-level library at /opt/nvidia/deepstream/deepstream/sources/libs/nvll_osd
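
As a starting point for such a port, it can help to have a small CPU reference of what a per-bounding-box blur computes, to validate a CUDA kernel against. The sketch below is an assumption-laden illustration, not the algorithm nvdsosd uses: it applies a simple box filter to one rectangle of a pitch-linear RGBA8 frame, and `Rect` and `box_blur_roi` are hypothetical names. In a CUDA version, the two outer loops would typically become one thread per output pixel.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bounding box in pixel coordinates.
struct Rect { int left, top, width, height; };

// Box-blur one rectangle of a pitch-linear RGBA8 frame in place.
// Each output pixel is the average of the source pixels within `radius`
// that lie inside the (frame-clamped) rectangle. Alpha is left untouched.
inline void box_blur_roi(uint8_t* data, int frame_w, int frame_h,
                         size_t pitch, Rect roi, int radius) {
    // Clamp the rectangle to the frame so we never touch out-of-bounds memory.
    const int x0 = std::max(roi.left, 0);
    const int y0 = std::max(roi.top, 0);
    const int x1 = std::min(roi.left + roi.width, frame_w);
    const int y1 = std::min(roi.top + roi.height, frame_h);
    if (x0 >= x1 || y0 >= y1 || radius <= 0) return;

    const int w = x1 - x0;
    const int h = y1 - y0;

    // Copy the rectangle first so that reads are not affected by our writes.
    std::vector<uint8_t> src(static_cast<size_t>(w) * h * 4);
    for (int y = 0; y < h; ++y) {
        const uint8_t* row = data + static_cast<size_t>(y0 + y) * pitch
                                  + static_cast<size_t>(x0) * 4;
        std::copy(row, row + static_cast<size_t>(w) * 4,
                  src.begin() + static_cast<size_t>(y) * w * 4);
    }

    for (int y = 0; y < h; ++y) {
        uint8_t* out_row = data + static_cast<size_t>(y0 + y) * pitch
                                + static_cast<size_t>(x0) * 4;
        for (int x = 0; x < w; ++x) {
            unsigned sum[3] = {0, 0, 0};
            unsigned count = 0;
            // Accumulate the neighbours that fall inside the rectangle.
            for (int sy = std::max(y - radius, 0); sy <= std::min(y + radius, h - 1); ++sy) {
                for (int sx = std::max(x - radius, 0); sx <= std::min(x + radius, w - 1); ++sx) {
                    const uint8_t* p = &src[(static_cast<size_t>(sy) * w + sx) * 4];
                    sum[0] += p[0];
                    sum[1] += p[1];
                    sum[2] += p[2];
                    ++count;
                }
            }
            for (int c = 0; c < 3; ++c) {
                out_row[x * 4 + c] = static_cast<uint8_t>(sum[c] / count);
            }
        }
    }
}
```

The function takes the pitch separately from the width because NvBufSurface planes are generally padded per row. Note that on Jetson a CUDA kernel would still need a CUDA-accessible pointer to the surface, which is the same memory-type question discussed earlier in this thread.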

Thanks, I'll work on it.