Memory leak even after calling NvBufSurfaceUnMap

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) Jetson Orin AGX
• DeepStream Version 7.0
• JetPack Version (valid for Jetson only) 6.0
• TensorRT Version 8.6.2

I am trying to access the frame in the gie_processing_done_buf_prob() function. Below is the code snippet I added to achieve this. The code runs without any errors, however jtop shows the memory usage steadily increasing, so I suspect there is a memory leak.



        // Declare surface buffer and map info for accessing buffer memory
        NvBufSurface *in_surf = nullptr;
        GstMapInfo in_map_info;
        memset(&in_map_info, 0, sizeof(in_map_info)); // Initialize map info structure
        // Map the GstBuffer to read data
        if (!gst_buffer_map(buf, &in_map_info, GST_MAP_READ)) {
            std::cerr << "Error: Failed to map GstBuffer" << std::endl;
            return GST_PAD_PROBE_OK;
        }
    
        // Assign the mapped buffer data to the surface pointer
        in_surf = reinterpret_cast<NvBufSurface *>(in_map_info.data);
      
        if (!in_surf) {
            std::cerr << "Error: NvBufSurface is null" << std::endl;
            gst_buffer_unmap(buf, &in_map_info);
            return GST_PAD_PROBE_OK;
        }
        
        // Map the surface to read memory for processing
        if (NvBufSurfaceMap(in_surf, -1, -1, NVBUF_MAP_READ) != 0) { 
            std::cerr << "Error: Failed to map NvBufSurface for read" << std::endl;
            return GST_PAD_PROBE_OK;
        }
        
        // Sync the surface memory for CPU access
        if (NvBufSurfaceSyncForCpu(in_surf, -1, -1) != 0) {
            printf("Sync CPU failed for in_surf\n");
        }
    
        // Check if the mapped address for surface is valid
        if (!in_surf->surfaceList[0].mappedAddr.addr[0]) {
            std::cerr << "Error: Mapped address is null for buffer " << std::endl;
        }
    
        // Get the surface dimensions (height, width, and pitch)
        guint height = in_surf->surfaceList[0].height;
        guint width = in_surf->surfaceList[0].width;
        guint pitch = in_surf->surfaceList[0].planeParams.pitch[0];
    
        // Create an OpenCV Mat in RGBA format using the surface data
        unsigned char *frame_data = (unsigned char *)in_surf->surfaceList[0].mappedAddr.addr[0];
        cv::Mat frame = cv::Mat(height, width, CV_8UC4, frame_data, pitch); // CV_8UC4: 8-bit unsigned, 4 channels (RGBA)
        
        // Convert the frame from RGBA to BGR for further processing
        cv::Mat in_frame_bgr;
        cv::cvtColor(frame, in_frame_bgr, cv::COLOR_RGBA2BGR);
        in_frame_bgr.convertTo(in_frame_bgr, CV_32FC3); // Convert to float
    
        // Unmap the surface after reading
        if (NvBufSurfaceUnMap(in_surf, -1, -1) != 0) {  
            std::cerr << "Error: Failed to unmap NvBufSurface" << std::endl;
            return GST_PAD_PROBE_OK;
        }

I have similar code in the nvdspreprocess CustomTensorPreparation() function and I do not see a memory leak there. If I comment out the above code in gie_processing_done_buf_prob(), the memory usage does not increase.
Thanks

        in_surf = reinterpret_cast<NvBufSurface *>(in_map_info.data);
      
        if (!in_surf) {
            std::cerr << "Error: NvBufSurface is null" << std::endl;
            gst_buffer_unmap(buf, &in_map_info);
            return GST_PAD_PROBE_OK;
        }

+       gst_buffer_unmap(buf, &in_map_info);
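To make the suggestion explicit, here is a minimal sketch of the balanced flow (a sketch only, not the complete probe): every successful gst_buffer_map() needs a matching gst_buffer_unmap(), and every successful NvBufSurfaceMap() a matching NvBufSurfaceUnMap(), on every exit path.

        // Sketch: pair each map call with its unmap on every path
        GstMapInfo in_map_info;
        memset(&in_map_info, 0, sizeof(in_map_info));
        if (!gst_buffer_map(buf, &in_map_info, GST_MAP_READ))
            return GST_PAD_PROBE_OK;

        NvBufSurface *in_surf = reinterpret_cast<NvBufSurface *>(in_map_info.data);
        if (!in_surf) {
            gst_buffer_unmap(buf, &in_map_info); // release the GstBuffer mapping before returning
            return GST_PAD_PROBE_OK;
        }
        gst_buffer_unmap(buf, &in_map_info); // the NvBufSurface pointer stays valid after this

        if (NvBufSurfaceMap(in_surf, -1, -1, NVBUF_MAP_READ) != 0)
            return GST_PAD_PROBE_OK;

        // ... NvBufSurfaceSyncForCpu() and OpenCV processing ...

        NvBufSurfaceUnMap(in_surf, -1, -1); // pairs with the NvBufSurfaceMap() above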

Hi,
Thanks for your response. I am already calling gst_buffer_unmap(buf, &in_map_info); I just forgot to paste it in my code snippet above. Pardon me for that confusion.
Regards

@junshengy I am already calling gst_buffer_unmap(buf, &in_map_info); I just forgot to paste it in my code snippet above. Pardon me for that confusion. I am looking forward to a reply.
Regards

Please refer to this FAQ to detect memory leaks with Valgrind.

There is no way to judge whether there is a memory leak from this code snippet alone.
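For example, a typical invocation looks like this (a sketch; substitute your own config file):

    valgrind --leak-check=full --show-leak-kinds=definite --num-callers=50 --log-file=valgrind.log ./deepstream-app -c <your_config.txt>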

@junshengy I have already run Valgrind by referring to the FAQ link you shared, and I have fixed one section where a leak was happening. I have also gone through many of the tickets on this forum.

I am using deepstream-app and I have modified the gie_processing_done_buf_prob() function. I found that if I comment out the code I added, there is no memory leak. I am checking memory consumption using jtop. If I keep the code section I shared earlier, the memory consumption keeps increasing and I eventually get the error shown in the attached screenshot.

So I am sure that only the section above is causing the memory leak. Please let me know the next steps.
Regards

@junshengy, can I have a reply, please?

As I mentioned above, there is nothing wrong with this code snippet. I ported it to deepstream-test3 and did not notice any memory growth.

deepstream_test3_app.cpp (22.4 KB)
This test program will keep running and will not exit.

 NVDS_TEST3_PERF_MODE=1 ./deepstream-test3-app file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 

You’d better check other modifications to deepstream-app and related configuration files.

There are a couple of files in the DeepStream folder named rtpjitterbuffer_eos_handling.patch and update_rtpmanager.sh. Do I need to do anything with these to get rid of the memory issue?

This patch only solves an RTSP stall problem; it has nothing to do with your issue.

This problem is related to the code or configuration file you modified. I don’t know what you modified.

Have you changed the type of in_surf->memType?
For Jetson, it should usually be NVBUF_MEM_SURFACE_ARRAY or NVBUF_MEM_DEFAULT.
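For example, a quick check you could add before NvBufSurfaceMap() (a sketch; the field and enum values are from nvbufsurface.h):

        // Sketch: confirm the surface memory type before mapping it for CPU access
        if (in_surf->memType != NVBUF_MEM_SURFACE_ARRAY &&
            in_surf->memType != NVBUF_MEM_DEFAULT) {
            g_print ("Unexpected memType: %d\n", (int) in_surf->memType);
        }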

I checked the code and I am not changing memType.

@junshengy Based on your comment, I checked for the memory leak again using the process I mentioned earlier, i.e. monitoring RAM consumption with jtop while commenting out the specific code section. My application also has a preprocess function that calls NvBufSurfaceMap() and NvBufSurfaceUnMap(), and there I did not find any memory leak; however, when these functions are called in gie_processing_done_buf_prob(), I see a leak. I am sure it is due to the section I shared in my first post, since when I comment it out I do not see any memory leak.
Can you check the same at your end?

I don’t see any memory leaks, and the above code snippet doesn’t run directly because of errors with cv::Mat(height, width, CV_8UC4, frame_data, pitch). Below are my modifications:

diff --git a/sources/apps/sample_apps/deepstream-app/deepstream_app.c b/sources/apps/sample_apps/deepstream-app/deepstream_app.c
index b0032c2..cda4048 100644
--- a/sources/apps/sample_apps/deepstream-app/deepstream_app.c
+++ b/sources/apps/sample_apps/deepstream-app/deepstream_app.c
@@ -15,6 +15,8 @@
 #include <math.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include <opencv2/opencv.hpp>
+#include "nvbufsurface.h"
 
 #include "deepstream_app.h"
 
@@ -1039,6 +1041,64 @@ gie_processing_done_buf_prob (GstPad * pad, GstPadProbeInfo * info,
   guint index = bin->index;
   AppCtx *appCtx = bin->appCtx;
 
+  // Declare surface buffer and map info for accessing buffer memory
+  NvBufSurface *in_surf = nullptr;
+  GstMapInfo in_map_info;
+  memset(&in_map_info, 0, sizeof(in_map_info)); // Initialize map info structure
+  // Map the GstBuffer to read data
+  if (!gst_buffer_map(buf, &in_map_info, GST_MAP_READ)) {
+    std::cerr << "Error: Failed to map GstBuffer" << std::endl;
+    return GST_PAD_PROBE_OK;
+  }
+
+  // Assign the mapped buffer data to the surface pointer
+  in_surf = reinterpret_cast<NvBufSurface *>(in_map_info.data);
+
+  if (!in_surf) {
+    std::cerr << "Error: NvBufSurface is null" << std::endl;
+    gst_buffer_unmap(buf, &in_map_info);
+    return GST_PAD_PROBE_OK;
+  }
+  gst_buffer_unmap(buf, &in_map_info);
+
+  // Map the surface to read memory for processing
+  if (NvBufSurfaceMap(in_surf, -1, -1, NVBUF_MAP_READ) != 0) {
+    std::cerr << "Error: Failed to map NvBufSurface for read" << std::endl;
+    return GST_PAD_PROBE_OK;
+  }
+
+  // Sync the surface memory for CPU access
+  if (NvBufSurfaceSyncForCpu(in_surf, -1, -1) != 0) {
+    printf("Sync CPU failed for in_surf\n");
+  }
+
+  // Check if the mapped address for surface is valid
+  if (!in_surf->surfaceList[0].mappedAddr.addr[0]) {
+    std::cerr << "Error: Mapped address is null for buffer " << std::endl;
+  }
+
+  // Get the surface dimensions (height, width, and pitch)
+  guint height = in_surf->surfaceList[0].height;
+  guint width = in_surf->surfaceList[0].width;
+  guint pitch = in_surf->surfaceList[0].planeParams.pitch[0];
+
+  // printf("colorFormat nb %d\n", in_surf->surfaceList[0].colorFormat);
+  // Create an OpenCV Mat in RGBA format using the surface data
+  unsigned char *frame_data = (unsigned char *)in_surf->surfaceList[0].mappedAddr.addr[0];
+  // CV_8UC4: 8-bit unsigned, 4 channels (RGBA)
+  cv::Mat frame = cv::Mat(height, width, CV_8UC4, frame_data, pitch);
+
+  // Convert the frame from RGBA to BGR for further processing
+  cv::Mat in_frame_bgr;
+  cv::cvtColor(frame, in_frame_bgr, cv::COLOR_RGBA2BGR);
+  in_frame_bgr.convertTo(in_frame_bgr, CV_32FC3); // Convert to float
+
+  // Unmap the surface after reading
+  if (NvBufSurfaceUnMap(in_surf, -1, -1) != 0) {
+    std::cerr << "Error: Failed to unmap NvBufSurface" << std::endl;
+    return GST_PAD_PROBE_OK;
+  }
+
   if (gst_buffer_is_writable (buf))
     process_buffer (buf, appCtx, index);
   return GST_PAD_PROBE_OK;
@@ -1274,7 +1334,7 @@ create_processing_instance (AppCtx * appCtx, guint index)
   NVGSTDS_BIN_ADD_GHOST_PAD (instance_bin->bin, last_elem, "sink");
   if (config->osd_config.enable) {
     NVGSTDS_ELEM_ADD_PROBE (instance_bin->all_bbox_buffer_probe_id,
-        instance_bin->osd_bin.nvosd, "sink",
+        instance_bin->osd_bin.nvosd, "src",
         gie_processing_done_buf_prob, GST_PAD_PROBE_TYPE_BUFFER, instance_bin);
   } else {
     NVGSTDS_ELEM_ADD_PROBE (instance_bin->all_bbox_buffer_probe_id,
diff --git a/sources/apps/apps-common/src/deepstream_osd_bin.c b/sources/apps/apps-common/src/deepstream_osd_bin.c
index f9ad039..668024a 100644
--- a/sources/apps/apps-common/src/deepstream_osd_bin.c
+++ b/sources/apps/apps-common/src/deepstream_osd_bin.c
@@ -34,6 +34,11 @@ create_osd_bin (NvDsOSDConfig * config, NvDsOSDBin * bin)
     goto done;
   }
 
+  bin->cap_filter = gst_element_factory_make (NVDS_ELEM_CAPS_FILTER, "osd_caps");
+  GstCaps *caps = gst_caps_from_string ("video/x-raw(memory:NVMM), format=RGBA");
+  g_object_set (G_OBJECT (bin->cap_filter), "caps", caps, NULL);
+  gst_caps_unref (caps);
+
   bin->queue = gst_element_factory_make (NVDS_ELEM_QUEUE, "osd_queue");
   if (!bin->queue) {
     NVGSTDS_ERR_MSG_V ("Failed to create 'osd_queue'");
@@ -60,7 +65,7 @@ create_osd_bin (NvDsOSDConfig * config, NvDsOSDBin * bin)
       NULL);
 
   gst_bin_add_many (GST_BIN (bin->bin), bin->queue,
-      bin->nvvidconv, bin->conv_queue, bin->nvosd, NULL);
+      bin->nvvidconv, bin->cap_filter, bin->conv_queue, bin->nvosd, NULL);
 
   g_object_set (G_OBJECT (bin->nvvidconv), "gpu-id", config->gpu_id, NULL);
 
@@ -77,7 +82,9 @@ create_osd_bin (NvDsOSDConfig * config, NvDsOSDBin * bin)
 
   NVGSTDS_LINK_ELEMENT (bin->queue, bin->nvvidconv);
 
-  NVGSTDS_LINK_ELEMENT (bin->nvvidconv, bin->conv_queue);
+  NVGSTDS_LINK_ELEMENT (bin->nvvidconv, bin->cap_filter);
+
+  NVGSTDS_LINK_ELEMENT (bin->cap_filter, bin->conv_queue);
 
   NVGSTDS_LINK_ELEMENT (bin->conv_queue, bin->nvosd);
diff --git a/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt b/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt
index a2254c4..37a5110 100644
--- a/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt
+++ b/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt
@@ -55,8 +55,8 @@ cudadec-memtype=0
 [sink0]
 enable=1
 #Type - 1=FakeSink 2=EglSink/nv3dsink (Jetson only) 3=File
-type=2
-sync=1
+type=1
+sync=0
 source-id=0
 gpu-id=0
 nvbuf-memory-type=0
@@ -187,4 +187,4 @@ operate-on-class-ids=0;
 config-file=config_infer_secondary_vehiclemake.txt
 
 [tests]
-file-loop=0
+file-loop=1

This is not a DeepStream problem; please debug it yourself. I used DS 7.1 to test it.

/deepstream-app -c /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt

mem.log (32.1 KB)
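A log like this can be captured with a simple loop (a sketch, assuming a single running deepstream-app process):

    while pidof deepstream-app > /dev/null; do grep VmRSS /proc/$(pidof deepstream-app)/status >> mem.log; sleep 1; done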

OK, thanks for sharing the results. Let me check.

@junshengy I am using DeepStream 7.0, so can you please check with that version? Maybe this is an issue with DS 7.0.

If there is a problem with a legacy version, please use the latest version. We generally do not provide patches for legacy versions.