Deepstream-app not running when more than 12 cams or 3 models

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 7.1
• TensorRT Version 10.6
• NVIDIA GPU Driver Version (valid for GPU only) 560.30.35
• Issue Type( questions, new requirements, bugs) Questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
I am running NVIDIA-AI-IOT / deepstream_reference_apps version 7.1 with a modified deepstream_parallel_infer_app.cpp from which the body-pose related code has been removed.

Q1 app is not running

I have 3 different YOLOX models.
If I run 1 model with 8 videos, it’s okay.
If I run 2 models with 8 videos, it’s okay.
But when I run 3 models with 9 videos, or 2 models with 16 videos, it’s not okay…
Sometimes 16 videos with 2 models works…
But mostly it gets stuck at the first 1 or 2 frames, showing a few video frames and a few green frames.
PERF shows 0 fps for all channels.

CPU: Intel(R) Xeon(R) W-2245 CPU @ 3.90GHz
GPU: NVIDIA RTX A4000

Q2 converting to cv::Mat

I tried to convert the frame to a cv::Mat this way, but it is not working properly…

for (l_frame = batch_meta->frame_meta_list; l_frame != NULL; l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) (l_frame->data);
    NvDsMetaList *l_next = NULL;

    if (surface && surface->surfaceList[frame_meta->batch_id].dataPtr) {
      int width = surface->surfaceList[frame_meta->batch_id].width;
      int height = surface->surfaceList[frame_meta->batch_id].height;

      unsigned char *cpu_frame = (unsigned char *) malloc(width * height * 4);
      cudaMemcpy(cpu_frame, surface->surfaceList[frame_meta->batch_id].dataPtr,
                 width * height * 4, cudaMemcpyDeviceToHost);

      cv::Mat frame(height, width, CV_8UC4, cpu_frame);         

I also tried to use get_converted_mat() from /opt/nvidia/deepstream/deepstream-7.1/sources/gst-plugins/gstdsexample.cpp.
However, I have no clue how to find the corresponding variable dsexample in parallel-infer…

Your help would be greatly appreciated… I sincerely seek your kind assistance.
Thank you

Sorry for the late reply! DS7.1 requires R535. Please make sure the component versions meet the requirements in this table.

Thank you for the reply!

I will install the matching version and try again!

Also, how can I extract frame data from nvbuffer?

My code is below.

  NvBufSurface *surface = (NvBufSurface *) in_map_info.data;
  for (l_frame = batch_meta->frame_meta_list; l_frame != NULL;
       l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *)(l_frame->data);
    NvBufSurfaceParams *params = &surface->surfaceList[frame_meta->batch_id];
    uint32_t width = params->width;
    uint32_t height = params->height;
    uint32_t pitch = params->pitch;
    void *data = params->mappedAddr.addr;

    // pitch = 2048, height = 1080, width = 1920
    // color format is NVBUF_COLOR_FORMAT_NV12_709
    // in GPU memory the layout is YUV: Y plane (W*H) followed by UV plane (W*H/2)

    size_t bufferSize = (width * height) + (width * height / 2); // Y (width x height) + UV (width x height / 2)
    uint8_t* dstCPUBuffer = new uint8_t[bufferSize];

    // use cudaMemcpy2D since pitch != width
    cudaError_t err = cudaMemcpy2D(dstCPUBuffer, width, data, pitch, width, height * 3 / 2, cudaMemcpyDeviceToHost);

    cv::Mat frame(params->height, params->width, CV_8UC4, dstCPUBuffer);
    cv::Mat bgr;
    cv::cvtColor(frame, bgr, cv::COLOR_RGBA2BGR);
  }


and I got this error:

CUDA memcpy failed: invalid argument

Is there a correct way to get the frame data on the CPU?
I need a CPU cv::Mat…

Thank you in advance :)

Please refer to this sample for how to access NV12 data with OpenCV.
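
For context on the layout involved: the values quoted above (pitch 2048, width 1920, height 1080) mean each NV12 row carries 128 bytes of padding, so a packed CPU copy has to be built row by row. The following is a minimal sketch, not the linked sample. `packNv12` is a hypothetical helper, it works on memory that is already CPU-accessible, and it assumes the UV plane starts immediately after the Y plane (at pitch * height), which may not hold for every surface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of DeepStream): copies a pitch-linear NV12
// image that already lives in CPU-accessible memory into a tightly packed
// buffer. NV12 stores `height` rows of Y followed by `height / 2` rows of
// interleaved UV; every row occupies `pitch` bytes, of which only the first
// `width` bytes are pixel data.
// Assumption: the UV plane starts right after the Y plane (pitch * height).
std::vector<uint8_t> packNv12(const uint8_t *src, size_t width, size_t height,
                              size_t pitch) {
  assert(pitch >= width && height % 2 == 0);
  const size_t rows = height * 3 / 2;          // Y rows + UV rows
  std::vector<uint8_t> packed(width * rows);   // no padding in the output
  for (size_t r = 0; r < rows; ++r) {
    // Skip the (pitch - width) padding bytes at the end of each source row.
    std::memcpy(packed.data() + r * width, src + r * pitch, width);
  }
  return packed;
}
```

For a 1920x1080 frame the packed result holds 1920 * 1620 bytes, and it can be viewed as a single-channel image of 1620 rows before color conversion.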

I followed the code snippet and got this error.


static GstPadProbeReturn
body_pose_gie_src_pad_buffer_probe(GstPad *pad, GstPadProbeInfo *info,
                          gpointer u_data)
{
  GstBuffer *buf = (GstBuffer *)info->data;
  NvDsMetaList *l_frame = NULL;
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta(buf);

  GstMapInfo in_map_info;
  if (!gst_buffer_map(buf, &in_map_info, GST_MAP_READ)) {
    g_print("Error: Failed to map gst buffer\n");
    return GST_PAD_PROBE_OK;
  }
  NvBufSurface *surface = (NvBufSurface *) in_map_info.data;
  for (l_frame = batch_meta->frame_meta_list; l_frame != NULL;
       l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *)(l_frame->data);
    NvBufSurfaceParams *params = &surface->surfaceList[frame_meta->batch_id];
    if (NvBufSurfaceMap (surface, -1, -1, NVBUF_MAP_READ) != 0) {
      g_print("NvBufSurfaceMap failed..\n");
    }
  }
  gst_buffer_unmap(buf, &in_map_info);
  return GST_PAD_PROBE_OK;
}


nvbufsurface: mapping of memory type (2) not supported
NvBufSurfaceMap failed..

It fails right at the start, in NvBufSurfaceMap.

According to the SDK API, NvBufSurfaceMap does the following:

Maps hardware batched buffers to the HOST or CPU address space.

Valid for NVBUF_MEM_CUDA_UNIFIED type memory for dGPU and NVBUF_MEM_SURFACE_ARRAY and NVBUF_MEM_HANDLE type memory for Jetson.

How can I convert the memory type of the surface?
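
One way to think about the failure, sketched under assumptions: the memory type decides how the CPU can reach the pixels. The enum values below mirror the ones printed in this thread (device 2, unified 3). `MemType`, `CpuAccess`, `memTypeFromInt` and `chooseCpuAccess` are hypothetical names, not SDK symbols. Treating device memory as something to copy out of (for example with cudaMemcpy2D from dataPtr) is an assumption about a workable alternative, not something confirmed in this thread.

```cpp
#include <cassert>

// Local stand-in for the memory types seen in this thread
// (default 0, pinned 1, device 2, unified 3). Hypothetical, not the SDK enum.
enum class MemType { Default = 0, CudaPinned = 1, CudaDevice = 2, CudaUnified = 3 };

// How a probe could try to reach pixel data from the CPU (hypothetical).
enum class CpuAccess { MapSurface, CopyFromDevice, Unsupported };

// Converts the integer printed by the log message into the local enum.
MemType memTypeFromInt(int value) {
  assert(value >= 0 && value <= 3);
  return static_cast<MemType>(value);
}

CpuAccess chooseCpuAccess(MemType type) {
  switch (type) {
    case MemType::CudaUnified:
      // The quoted API text says mapping is valid for unified memory on dGPU.
      return CpuAccess::MapSurface;
    case MemType::CudaDevice:
      // Mapping is rejected ("memory type (2) not supported"); assumption:
      // a device-to-host copy from dataPtr is the alternative.
      return CpuAccess::CopyFromDevice;
    default:
      // Other types are not covered by this sketch.
      return CpuAccess::Unsupported;
  }
}
```

In a probe, such a check would run once per batch on surface->memType before deciding whether to call NvBufSurfaceMap at all.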

Based on the example code in /opt/nvidia/deepstream/deepstream-7.1/sources/gst-plugins/gst-dsexample/gstdsexample_optimized.cpp, I changed my code as below.

static GstPadProbeReturn
body_pose_gie_src_pad_buffer_probe(GstPad *pad, GstPadProbeInfo *info,
                          gpointer u_data)
{
  gchar *msg = NULL;
  GstBuffer *buf = (GstBuffer *)info->data;
  NvDsMetaList *l_frame = NULL;
  NvDsMetaList *l_obj = NULL;
  NvDsMetaList *l_user = NULL;
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta(buf);

  GstMapInfo in_map_info;
  if (!gst_buffer_map(buf, &in_map_info, GST_MAP_READ)) {
    g_print("Error: Failed to map gst buffer\n");
    return GST_PAD_PROBE_OK;
  }
  NvBufSurface *surface = (NvBufSurface *) in_map_info.data;

  NvBufSurfaceCreateParams create_params = { 0 };
  create_params.gpuId = 0;
  create_params.width = 1920;
  create_params.height = 1080;
  create_params.size = 0;
  create_params.colorFormat = NVBUF_COLOR_FORMAT_NV12_709;
  create_params.layout = NVBUF_LAYOUT_PITCH;
  create_params.memType = NVBUF_MEM_CUDA_UNIFIED;

  if (NvBufSurfaceCreate (&surface, 16, &create_params) != 0) {
    GST_ERROR ("Error: Could not allocate internal buffer for dsexample");
  }
  // NvBufSurfaceMap is valid for NVBUF_MEM_CUDA_UNIFIED (3) type memory for dGPU
  if (NvBufSurfaceMap (surface, -1, -1, NVBUF_MAP_READ) != 0) {
    g_print("NvBufSurfaceMap failed..\n");
  }

  NvBufSurfaceSyncForCpu (surface, 0, 0);

  for (l_frame = batch_meta->frame_meta_list; l_frame != NULL;
       l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *)(l_frame->data);
    NvBufSurfaceParams *params = &surface->surfaceList[frame_meta->batch_id];
    // g_print("default %d pined %d device %d, unified %d, this %d\n", NVBUF_MEM_DEFAULT, NVBUF_MEM_CUDA_PINNED,
    //         NVBUF_MEM_CUDA_DEVICE, NVBUF_MEM_CUDA_UNIFIED, surface->memType);
    // default 0 pined 1 device 2, unified 3, this 2
    if (frame_meta->source_id < 4) {
      guint height = surface->surfaceList[frame_meta->batch_id].height;
      guint width = surface->surfaceList[frame_meta->batch_id].width;
      cv::Mat nv12_mat = cv::Mat(height * 3 / 2, width, CV_8UC1, surface->surfaceList[frame_meta->batch_id].mappedAddr.addr[0], surface->surfaceList[frame_meta->batch_id].pitch);
      cv::Mat rgba_mat;
      cv::cvtColor(nv12_mat, rgba_mat, cv::COLOR_YUV2RGBA_NV12);
      std::string filename = "/workspace/tmpjpg/src" + std::to_string(frame_meta->source_id) + "_frame" + std::to_string(frame_meta->frame_num) + ".jpg";
      cv::imwrite(filename, rgba_mat);
    }
  }

  NvBufSurfaceUnMap(surface, 0, 0);

  gst_buffer_unmap(buf, &in_map_info);

  return GST_PAD_PROBE_OK;
}

The code runs, but the image files are not saved correctly…

Each file is just 33 KB and shows only a green screen…

Could you please help me?

It seems you are creating a new surface. Please use the original one,
NvBufSurface *surface = (NvBufSurface *) in_map_info.data;, and make sure “sur->colorFormat” is NV12.
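
A possible explanation for the green images, assuming the newly created surface holds zeroed data that never received the decoded frame: in YUV, an all-zero sample does not decode to black but to green. The sketch below shows the arithmetic with the common BT.601 limited-range coefficients. `Rgb` and `yuvToRgb` are hypothetical names, and the exact matrix used by cv::cvtColor is not taken from this thread.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Rgb { uint8_t r, g, b; };

// Converts one limited-range YUV sample to RGB using widely used BT.601
// coefficients. Which matrix a given library applies is an assumption here.
Rgb yuvToRgb(uint8_t y, uint8_t u, uint8_t v) {
  const double c = 1.164 * (y - 16.0);  // luma, offset by the black level
  const double d = u - 128.0;           // blue-difference chroma, centered
  const double e = v - 128.0;           // red-difference chroma, centered
  // Clamp to the 8-bit range, then round to the nearest integer.
  auto clamp8 = [](double x) {
    return static_cast<uint8_t>(std::lround(std::min(255.0, std::max(0.0, x))));
  };
  return {clamp8(c + 1.596 * e),
          clamp8(c - 0.392 * d - 0.813 * e),
          clamp8(c + 2.017 * d)};
}
```

With y = u = v = 0 the red and blue channels clamp to 0 while green stays well above 100, which is consistent with a solid green JPEG. Black in this encoding is y = 16, u = v = 128.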

The current color format is NVBUF_COLOR_FORMAT_NV12_709.

I checked the color format by printing the variable.

static GstPadProbeReturn
body_pose_gie_src_pad_buffer_probe(GstPad *pad, GstPadProbeInfo *info,
                          gpointer u_data)
{
  GstBuffer *buf = (GstBuffer *)info->data;
  NvDsMetaList *l_frame = NULL;
  NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta(buf);

  GstMapInfo in_map_info;
  if (!gst_buffer_map(buf, &in_map_info, GST_MAP_READ)) {
    g_print("Error: Failed to map gst buffer\n");
    return GST_PAD_PROBE_OK;
  }
  NvBufSurface *surface = (NvBufSurface *) in_map_info.data;
  NvBufSurfaceMap(surface, -1, -1, NVBUF_MAP_READ);
  NvBufSurfaceSyncForCpu (surface, 0, 0);

  for (l_frame = batch_meta->frame_meta_list; l_frame != NULL;
       l_frame = l_frame->next) {
    NvDsFrameMeta *frame_meta = (NvDsFrameMeta *)(l_frame->data);
    NvBufSurfaceParams *params = &surface->surfaceList[frame_meta->batch_id];
    if (frame_meta->source_id == 0) {
      g_print("Current color format is  %d \n" , params->colorFormat);
    }
  }
  NvBufSurfaceUnMap(surface, 0, 0);
  gst_buffer_unmap(buf, &in_map_info);
  return GST_PAD_PROBE_OK;
}

Current color format is  33 

If I use NvBufSurfaceMap, I get a memory type error.

nvbufsurface: mapping of memory type (2) not supported

According to the SDK API I mentioned above, NvBufSurfaceMap only works with
NVBUF_MEM_CUDA_UNIFIED, which has value 3, while the current memory type is 2.

So I tried to convert the memory type by following the example in /opt/nvidia/deepstream/deepstream-7.1/sources/gst-plugins/gst-dsexample/gstdsexample_optimized.cpp:

`NVBUF_MEM_CUDA_DEVICE` (2) --> `NVBUF_MEM_CUDA_UNIFIED` (3) by creating a surface with new params.

As a result, it didn’t work…

Bottom line,

  • The color format is NV12 (NVBUF_COLOR_FORMAT_NV12_709, value 33)
  • NvBufSurfaceMap fails because of the memory type

What can I do …?

Please add an nvvideoconvert before nvdsexample to output `NVBUF_MEM_CUDA_UNIFIED` type memory. Here is a sample:

gst-launch-1.0 uridecodebin uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4 ! nvvideoconvert nvbuf-memory-type=3 ! fakesink

I modified every nvbuf-memory-type in yml and it worked!!!

Thank you so much!!