Memory leak in deepstream-test4 python

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU (Tesla T4)
• DeepStream Version 7.1
• TensorRT Version nvcr.io/nvidia/deepstream:7.1-triton-multiarch
• NVIDIA GPU Driver Version (valid for GPU only) 535
• Issue Type( questions, new requirements, bugs) bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

The issue occurs in the deepstream-test4 Python sample application running with Kafka. I have updated its code to use uridecodebin for an RTSP source and set the message frequency to every 5th frame.

I ran 6 concurrent processes with different Kafka topics and observed that RAM usage increases continuously. Attached is a graph of the container's memory usage, which clearly indicates a memory leak.

python3 deepstream_test_4.py -i "rtsp://localhost:8554/test" -p /opt/nvidia/deepstream/deepstream-7.1/lib/libnvds_kafka_proto.so --conn-str="localhost;29092;prediction-1" --no-display
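For reference, below is a minimal sketch of the source change described above (uridecodebin feeding nvstreammux), following the usual deepstream_python_apps pattern; names such as cb_newpad are from my test code, and the stock samples additionally check the pad caps for the memory:NVMM feature, which is omitted here for brevity.

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

def cb_newpad(decodebin, decoder_src_pad, streammux):
    # Link only the decoded video pad from uridecodebin to the muxer.
    caps = decoder_src_pad.get_current_caps() or decoder_src_pad.query_caps(None)
    if caps.to_string().startswith("video"):
        sinkpad = streammux.get_request_pad("sink_0")
        decoder_src_pad.link(sinkpad)

Gst.init(None)
pipeline = Gst.Pipeline()
streammux = Gst.ElementFactory.make("nvstreammux", "stream-muxer")
streammux.set_property("batch-size", 1)
streammux.set_property("width", 1920)
streammux.set_property("height", 1080)
streammux.set_property("batched-push-timeout", 4000000)
pipeline.add(streammux)

uri_decode_bin = Gst.ElementFactory.make("uridecodebin", "uri-decode-bin")
uri_decode_bin.set_property("uri", "rtsp://localhost:8554/test")
uri_decode_bin.connect("pad-added", cb_newpad, streammux)
pipeline.add(uri_decode_bin)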

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

We have multiple streams running on the client’s server, and after a week, the memory becomes full, causing the system to crash. The issue is most likely a memory leak occurring when allocating data in osd_sink_pad_buffer_probe for event msg meta and object meta. I also tested the same setup by commenting out osd_sink_pad_buffer_probe, and in that case, no memory leak was observed.
If possible, please confirm the issue and let me know if there are any available workarounds.
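For context, the part of the probe I suspect looks roughly like the sketch below, which follows the stock deepstream_test_4.py pattern with my 5-frame message interval. The exact pyds.alloc_nvds_event_msg_meta() signature varies between pyds releases, so treat the call details as illustrative rather than my literal code.

import pyds
from gi.repository import Gst

MSG_RATE = 5  # send an event message every 5th frame instead of every 30th

def osd_sink_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            if frame_meta.frame_num % MSG_RATE == 0:
                user_event_meta = pyds.nvds_acquire_user_meta_from_pool(batch_meta)
                if user_event_meta:
                    # The allocations below (event msg meta plus the strings filled in
                    # by generate_event_msg_meta in the sample) are where I believe
                    # the leaked memory is coming from.
                    msg_meta = pyds.alloc_nvds_event_msg_meta(user_event_meta)
                    msg_meta.bbox.top = obj_meta.rect_params.top
                    msg_meta.bbox.left = obj_meta.rect_params.left
                    msg_meta.bbox.width = obj_meta.rect_params.width
                    msg_meta.bbox.height = obj_meta.rect_params.height
                    msg_meta.frameId = frame_meta.frame_num
                    user_event_meta.user_meta_data = msg_meta
                    user_event_meta.base_meta.meta_type = \
                        pyds.NvDsMetaType.NVDS_EVENT_MSG_META
                    pyds.nvds_add_user_meta_to_frame(frame_meta, user_event_meta)
            l_obj = l_obj.next
        l_frame = l_frame.next
    return Gst.PadProbeReturn.OK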

1. This is a known issue, but the fix has not been released yet.
You can refer to the following patch to modify the local code, then recompile and install both
/opt/nvidia/deepstream/deepstream/sources/libs/nvmsgconv and /opt/nvidia/deepstream/deepstream/sources/gst-plugins/gst-nvurisrcbin:

diff --git a/src/gst-plugins/gst-nvurisrcbin/gstdsnvurisrcbin.cpp b/src/gst-plugins/gst-nvurisrcbin/gstdsnvurisrcbin.cpp
index da8bc70..db2883a 100644
--- a/src/gst-plugins/gst-nvurisrcbin/gstdsnvurisrcbin.cpp
+++ b/src/gst-plugins/gst-nvurisrcbin/gstdsnvurisrcbin.cpp
@@ -782,9 +782,17 @@
       config->smart_record = (NvDsUriSrcBinSRType) g_value_get_enum (value);
       break;
     case PROP_SMART_RECORD_DIR_PATH:
+      if (config->smart_rec_dir_path != NULL) {
+        g_free (config->smart_rec_dir_path);
+        config->smart_rec_dir_path = NULL;
+      }
       config->smart_rec_dir_path = g_value_dup_string (value);
       break;
     case PROP_SMART_RECORD_FILE_PREFIX:
+      if (config->smart_rec_file_prefix != NULL) {
+        g_free (config->smart_rec_file_prefix);
+        config->smart_rec_file_prefix = NULL;
+      }
       config->smart_rec_file_prefix = g_value_dup_string (value);
       break;
     case PROP_SMART_RECORD_VIDEO_CACHE:
@@ -854,6 +862,10 @@
       config->ipc_buffer_timestamp_copy = g_value_get_boolean (value);
       break;
     case PROP_IPC_SOCKET_PATH:
+      if (config->ipc_socket_path != NULL) {
+        g_free(config->ipc_socket_path);
+        config->ipc_socket_path = NULL;
+      }
       config->ipc_socket_path = g_value_dup_string (value);
       break;
     case PROP_IPC_CONNECTION_ATTEMPTS:
@@ -1063,12 +1075,23 @@
 {
   GstDsNvUriSrcBin *nvurisrcbin = GST_DS_NVURISRC_BIN (object);
 
-  if (nvurisrcbin->config->ipc_socket_path) {
-    g_free(nvurisrcbin->config->ipc_socket_path);
-    nvurisrcbin->config->ipc_socket_path = NULL;
+  if (nvurisrcbin->config) {
+    if (nvurisrcbin->config->ipc_socket_path) {
+      g_free(nvurisrcbin->config->ipc_socket_path);
+      nvurisrcbin->config->ipc_socket_path = NULL;
+    }
+    if (nvurisrcbin->config->smart_rec_dir_path) {
+      g_free(nvurisrcbin->config->smart_rec_dir_path);
+      nvurisrcbin->config->smart_rec_dir_path = NULL;
+    }
+    if (nvurisrcbin->config->smart_rec_file_prefix) {
+      g_free(nvurisrcbin->config->smart_rec_file_prefix);
+      nvurisrcbin->config->smart_rec_file_prefix = NULL;
+    }
+    g_free (nvurisrcbin->config);
+    nvurisrcbin->config = NULL;
   }
 
-  g_free (nvurisrcbin->config);
   destroy_smart_record_bin (nvurisrcbin);
 
   G_OBJECT_CLASS (parent_class)->finalize (object);
diff --git a/src/utils/nvmsgconv/deepstream_schema/eventmsg_payload.cpp b/src/utils/nvmsgconv/deepstream_schema/eventmsg_payload.cpp
index 39c72ab..a06925e 100644
--- a/src/utils/nvmsgconv/deepstream_schema/eventmsg_payload.cpp
+++ b/src/utils/nvmsgconv/deepstream_schema/eventmsg_payload.cpp
@@ -689,8 +689,8 @@
   json_object_set_object_member (objectObj, "pose", pobject);
 
   //===Embedding model full schema data population===
-  jobject = json_object_new ();
   if (meta->embedding.embedding_vector && meta->embedding.embedding_length) {
+    jobject = json_object_new ();
     json_object_set_int_member(jobject, "embedding_length", meta->embedding.embedding_length);
     JsonArray *embeddingArray = json_array_sized_new (meta->embedding.embedding_length);
     for (guint idx = 0; idx<meta->embedding.embedding_length; idx++){
@@ -701,8 +701,8 @@
   }
 
   // Single-view 3D Tracking metadata
-  jobject = json_object_new ();
   if(meta->has3DTracking) {
+    jobject = json_object_new ();
     json_object_set_double_member (jobject, "visibility", meta->singleView3DTracking.visibility);
 
     JsonArray *footLoc2DArray = json_array_sized_new (2);

2. It is recommended to use nvurisrcbin instead of uridecodebin; it has an RTSP reconnection feature (a sketch is included after point 3 below).

3. Use the following command to monitor memory usage. Monitoring the memory usage of the entire Docker container cannot confirm that the problem is caused by DeepStream:

PYTHONMALLOC=malloc valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all \
 --suppressions=/usr/lib/valgrind/python3.supp  python3 deepstream_test_4.py -i \
file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264 \
-p /opt/nvidia/deepstream/deepstream/lib/libnvds_kafka_proto.so \
--conn-str="localhost;9092;test-topic" -s 0 --no-display
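Regarding point 2, a minimal sketch of switching the source to nvurisrcbin is shown below, reusing the same pad-added handling (and the cb_newpad/streammux/pipeline names) as with uridecodebin. Confirm the exact property names with gst-inspect-1.0 nvurisrcbin on your installation; the reconnect property name here is an assumption.

# Sketch only: nvurisrcbin exposes decoded pads via pad-added, like uridecodebin.
src_bin = Gst.ElementFactory.make("nvurisrcbin", "uri-src-bin")
src_bin.set_property("uri", "rtsp://localhost:8554/test")
# Assumed property name: reconnect after N seconds without RTSP data.
src_bin.set_property("rtsp-reconnect-interval", 10)
src_bin.connect("pad-added", cb_newpad, streammux)
pipeline.add(src_bin)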

I have applied the fix in the nvmsgconv plugin, but the memory usage is still increasing. The fix seems to slow down the memory leak, but I can still see it rising after 6-10 hours.

I am attaching the code I used for testing. I am running this test in a fresh nvcr.io/nvidia/deepstream:7.1-triton-multiarch container, so the only thing running in the Docker environment is the DeepStream process. I have monitored it through docker stats as well as htop.

You can also increase the message frequency to see the difference in RAM sooner.
deepstream_test_4.txt (19.8 KB)

Can you try using native code? I’m not sure if this problem is related to the Python VM.

Also, the command I provided above is meant to capture the location of the memory leak, not just to confirm that there is one.

I’m trying to reproduce the problem.

I tried running the native code and it seems to be working fine without any memory leaks for more than 6 hours.
I modified the native app to match the Python implementation, including updating the message frequency and changing the source to uridecodebin. It looks like the issue is specific to the Python application.

This is consistent with my test results, and I will try to find the problem in the Python program.


Are there any updates or workarounds for this?

In addition to the above patch, please apply the following change to bindschema.cpp, and then recompile and install pyds:

diff --git a/bindings/src/bindschema.cpp b/bindings/src/bindschema.cpp
index 7085d3d..a163dd9 100644
--- a/bindings/src/bindschema.cpp
+++ b/bindings/src/bindschema.cpp
@@ -160,6 +160,7 @@ namespace pydeepstream {
                     g_free (srcData->extMsg);
                     srcData->extMsgSize = 0;
                 }
+                g_free(srcData);
             }
         }
     }