DeepStream nvdsvideotemplate: custom ProcessBuffer for image processing

• Hardware Platform (Jetson / GPU) : Jetson
• DeepStream Version : DS6.2
• JetPack Version (valid for Jetson only) : JP5.1.2
• Issue Type( questions, new requirements, bugs) : Question

Hi,
I want to apply CUDA-kernel-based image processing that transforms the input buffer and pushes the result downstream (for example, remapping the input image to a different shape and writing it to the output buffer).

However, I’m struggling to work out how to process an NvBufSurface. With the code below, I get this error:

terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.5.4) /opt/nvidia/deepstream/deepstream-6.2/opencv_contrib-4.5.4/modules/cudafilters/src/cuda/row_filter.hpp:172: error: (-217:Gpu API call) an illegal memory access was encountered in function 'caller'

Q. I have attached the code below. Could you help me implement what I am trying to do?

  • I want to use NV12 input/output eventually, but I started with the RGBA format.
... ! nvvideoconvert ! 'video/x-raw(memory:NVMM), format=RGBA, width=3840, height=2160' ! nvdsvideotemplate customlib-name="./customlib_impl/libcustom_videoimpl.so" customlib-props="scale-factor:2.0" ! ...
  • I’m modifying SampleAlgorithm::ProcessBuffer(GstBuffer *inbuf) in the file gst-nvdsvideotemplate/customlib_impl/customlib_impl.cpp
// in ProcessBuffer
...
// Push buffer to process thread for further processing
  PacketInfo packetInfo;
  packetInfo.inbuf = inbuf;
  packetInfo.frame_num = m_frameNum;

  // Add custom preprocessing logic if required, here
  // Pass the buffer to output_loop for further processing and pushing to the next component
  // Currently it's just dumping a few decoded video frames

  /////////// START /////////////
  static bool create_filter = true;
  static cv::Ptr<cv::cuda::Filter> filter;

  NvBufSurfaceMap(in_surf, 0, 0, NVBUF_MAP_READ_WRITE);
  cuCtxSynchronize();
  if (create_filter)
  {
    filter = cv::cuda::createSobelFilter(CV_8UC4, CV_8UC4, 1, 0, 3, 1, cv::BORDER_DEFAULT);
    // filter = cv::cuda::createGaussianFilter(CV_8UC4, CV_8UC4, cv::Size(31,31), 0, 0, cv::BORDER_DEFAULT);
    create_filter = false;
  }
  cv::cuda::GpuMat test = cv::cuda::GpuMat(in_surf->surfaceList[0].height, in_surf->surfaceList[0].width, CV_8UC4, (unsigned char *)in_surf->surfaceList[0].dataPtr);
  
  // this is an example
  // I want to use custom cuda kernel once this test works.
  filter->apply(test, test);
 
  cuCtxSynchronize();
  NvBufSurfaceUnMap(in_surf, 0, 0);
  //////////  END //////////////

  // Enable for dumping the input frame, for debugging purpose
  if (0)
    DumpNvBufSurface(in_surf, batch_meta);

  m_processLock.lock();
  m_processQ.push(packetInfo);
  m_processCV.notify_all();
  m_processLock.unlock();
...

The default memory type on Jetson is nvbuf-mem-surface-array. Could you try setting it to CUDA memory and setting compute-hw to GPU?
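
To make the reasoning concrete, here is a minimal sketch of a guard that could run before a surface pointer is handed to CUDA code. It is standalone and does not include the DeepStream headers: MemType is a local, hypothetical mirror of the memory-type names in nvbufsurface.h, and isCudaAddressable and the isJetson flag are illustrative names, not part of any NVIDIA API. It assumes that the default type resolves to surface-array memory on Jetson and to CUDA device memory on dGPU.

```cpp
#include <cassert>

// Hypothetical local mirror of the NvBufSurface memory-type names.
// In real code, compare against the enumerators from nvbufsurface.h instead.
enum class MemType {
  Default,
  CudaPinned,
  CudaDevice,
  CudaUnified,
  SurfaceArray,
  Handle,
  System
};

// Returns true when a surface's data pointer can be passed straight to
// CUDA code (for example, wrapped in a cv::cuda::GpuMat) without mapping.
// Assumption: Default means surface-array on Jetson and CUDA device on dGPU.
inline bool isCudaAddressable(MemType type, bool isJetson) {
  switch (type) {
    case MemType::CudaPinned:
    case MemType::CudaDevice:
    case MemType::CudaUnified:
      return true;
    case MemType::Default:
      return !isJetson;
    default:
      // SurfaceArray, Handle and System are not plain CUDA pointers.
      return false;
  }
}
```

In ProcessBuffer, a check like this on the surface's memory type would turn the illegal memory access into an early, readable error when the upstream element is still producing surface-array buffers.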


You are amazing…
I thought there was something wrong in the code…

For others’ reference, I changed the GStreamer option
from ! nvvideoconvert !
to ! nvvideoconvert nvbuf-memory-type=1 compute-hw=1 !

@yuweiw
By the way,
do you see any unnecessary lines of code, such as
NvBufSurfaceMap and cuCtxSynchronize?

Or are they necessary?

Thank you.

No, they are not needed. You can refer to NvBufSurfaceMap in our guide: it is used for mapping hardware batched buffers to the host (CPU) address space. Since you are using a GPU buffer, they are not necessary.
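
Once the buffer is in CUDA memory and the data pointer is used directly, one detail is worth checking when wrapping it: rows in the surface are spaced by its pitch in bytes, which can be larger than width * 4 for RGBA. The cv::cuda::GpuMat constructor accepts a step argument for this purpose. The sketch below shows only the addressing arithmetic on the CPU; SurfaceView is a hypothetical stand-in for the few surface fields involved, and pixelOffset and isTightlyPacked are illustrative helpers, not DeepStream or OpenCV functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the per-surface fields used here
// (base pointer, width, height, pitch).
struct SurfaceView {
  std::uint8_t* data;    // base address of the first row
  std::uint32_t width;   // width in pixels
  std::uint32_t height;  // height in rows
  std::uint32_t pitch;   // bytes from one row start to the next
};

constexpr std::size_t kBytesPerPixelRGBA = 4;

// Byte offset of pixel (x, y) in a pitched RGBA surface.
inline std::size_t pixelOffset(const SurfaceView& s,
                               std::uint32_t x, std::uint32_t y) {
  assert(x < s.width && y < s.height);
  return static_cast<std::size_t>(y) * s.pitch +
         static_cast<std::size_t>(x) * kBytesPerPixelRGBA;
}

// True when rows have no padding, so a default (automatic) step would match.
inline bool isTightlyPacked(const SurfaceView& s) {
  return s.pitch == s.width * kBytesPerPixelRGBA;
}
```

If isTightlyPacked is false for a given surface, passing the pitch as the step when constructing the GpuMat keeps row addressing correct; a custom CUDA kernel would use the same y * pitch + x * 4 arithmetic.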

