CUDA failure after repeatedly adding and deleting sources at runtime

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
GPU
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208…    On   | 00000000:01:00.0  On |                  N/A |
| 37%   66C    P2   154W / 250W |   6938MiB / 10997MiB |      94%     Default |
+-------------------------------+----------------------+----------------------+

• DeepStream Version

DeepStream version 4.0.2

• TensorRT Version
TensorRT --version 6.0.1

• NVIDIA GPU Driver Version (valid for GPU only)
440

I added runtime add/delete functionality to deepstream-app. However, after adding and deleting sources multiple times (6 sources, 6 add/delete rounds), the program crashed.
The error information is shown below:

Cuda failure: status=2
Error(-1) in buffer allocation

** (deepstream-app:11288): CRITICAL **: 15:09:24.333: gst_nvds_buffer_pool_alloc_buffer: assertion ‘mem’ failed
ERROR from src_bin_muxer: Failed to allocate the buffers inside the Nvstreammux output pool
Debug info: gstnvstreammux.c(566): gst_nvstreammux_alloc_output_buffers (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstNvStreamMux:src_bin_muxer
Quitting

While the program is running, GPU memory usage also keeps increasing until it is exhausted. It seems the corresponding streammux buffers are not released after a source is deleted. Since streammux is not open source, I can't get more information. I have implemented all the functions in:


which is the logic of your runtime add/delete example on GitHub.

Could you please help me find out where the problem is?

Hi @weiweifu
Which tracker are you using? Could you try running without the tracker in your custom code and see if that works? We found a memory leak in the tracker which is being fixed in the 5.0 GA release.
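
If you are running the standard deepstream-app config, disabling it for this test should just be a matter of (assuming the usual [tracker] group in your config file):

[tracker]
enable=0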

If it still does not work, could you share your changes so we can reproduce and debug?

Thanks!

Hi, have you solved this problem? We are also running into the same issue.

Hi, I use the KLT tracker now. I don't think my problem is related to DeepStream's own tracker leak, since the memory leak still exists in my program when I disable the tracker.

Below are the add function and the delete function. I tried setting breakpoints in the program to find out how much memory each section uses, e.g. how much memory is allocated when I create a new source bin and how much is freed when I delete it, but that was not conclusive. When I call the add function, the memory only changes after set_state to PLAYING and after control returns to the main loop in main(). When I call the delete function, the memory only changes after set_state to NULL. But it is obvious that the memory freed on delete (~90 MB) is less than the memory added (~300 MB). My guess is that the buffer pool in streammux is not freed after the source is deleted, so as the number of add/delete rounds increases, GPU memory is eventually exhausted.

gboolean
stop_release_source (gint source_id)
{
  GstStateChangeReturn state_return;
  gchar pad_name[16];
  GstPad *sinkpad = NULL;

  /* Stop the source bin before detaching it from the pipeline. */
  state_return =
      gst_element_set_state (appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin,
      GST_STATE_NULL);
  guint camera_id = appCtx[0]->config.multi_source_config[source_id].camera_id;
  g_print ("Camera_id of deleted source is: %d, source_id is: %d\n", camera_id, source_id);
  switch (state_return) {
    case GST_STATE_CHANGE_SUCCESS:
      g_print ("STATE CHANGE SUCCESS\n\n");
      /* Flush the matching streammux sink pad, unlink it and drop the bin. */
      g_snprintf (pad_name, 15, "sink_%u", source_id);
      sinkpad = gst_element_get_static_pad (appCtx[0]->pipeline.multi_src_bin.streammux, pad_name);
      gst_pad_send_event (sinkpad, gst_event_new_flush_stop (FALSE));
      unlink_element_from_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
          appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
      g_print ("SINK PAD-STATE CHANGE SUCCESS\n\n");
      gst_object_unref (sinkpad);
      gst_bin_remove (GST_BIN (appCtx[0]->pipeline.multi_src_bin.bin),
          appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
      /* Book-keeping: one source fewer in the app context and config. */
      appCtx[0]->pipeline.multi_src_bin.num_bins--;
      appCtx[0]->config.num_source_sub_bins--;
      memset (&appCtx[0]->config.multi_source_config[source_id], 0,
          sizeof (appCtx[0]->config.multi_source_config[source_id]));
      return TRUE;
      break;
    case GST_STATE_CHANGE_FAILURE:
      g_print ("STATE CHANGE FAILURE\n\n");
      return FALSE;
      break;
    case GST_STATE_CHANGE_ASYNC:
      g_print ("STATE CHANGE ASYNC\n\n");
      return TRUE;
      break;
    case GST_STATE_CHANGE_NO_PREROLL:
      g_print ("STATE CHANGE NO PREROLL\n\n");
      break;
    default:
      break;
  }
  return TRUE;
}

gboolean
add_sources (gchar * new_camera_addr, guint source_id)
{
  NvDsPipeline *pipeline = &appCtx[0]->pipeline;
  GstStateChangeReturn state_return;
  NvDsConfig *config = &appCtx[0]->config;
  guint i;

  /* Fill the source config for the new camera and build its source bin. */
  parse_dir_source (&appCtx[0]->config, new_camera_addr, source_id);
  if (!create_source_bin (&appCtx[0]->config.multi_source_config[source_id],
          &appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id], source_id)) {
    g_print ("Failed to Create the New source_bin");
    return FALSE;
  }
  gst_bin_add (GST_BIN (appCtx[0]->pipeline.multi_src_bin.bin),
      appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
  if (!link_element_to_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
          appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin, source_id)) {
    g_print ("Cannot Link With Streammux");
  }
  appCtx[0]->pipeline.multi_src_bin.num_bins++;

  /* Bring the new source bin up to PLAYING so it joins the running pipeline. */
  state_return =
      gst_element_set_state (appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin,
      GST_STATE_PLAYING);
  switch (state_return) {
    case GST_STATE_CHANGE_SUCCESS:
      g_print ("STATE CHANGE SUCCESS\n\n");
      return TRUE;
      break;
    case GST_STATE_CHANGE_FAILURE:
      g_print ("STATE CHANGE FAILURE\n\n");
      return FALSE;
      break;
    case GST_STATE_CHANGE_ASYNC:
      g_print ("STATE CHANGE ASYNC\n\n");
      /* Block until the asynchronous state change completes. */
      state_return =
          gst_element_get_state (appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin,
          NULL, NULL, GST_CLOCK_TIME_NONE);
      return TRUE;
      break;
    case GST_STATE_CHANGE_NO_PREROLL:
      g_print ("STATE CHANGE NO PREROLL\n\n");
      break;
    default:
      break;
  }
  return TRUE;
}
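
One thing I have not been able to rule out yet: whether unlink_element_from_streammux_sink_pad() also releases the streammux sink pad, which is a request pad. If it does not, the pad might stay alive after the source bin is removed. The extra step would roughly be to give the pad back to streammux before unreffing it in stop_release_source() (sketch only, I have not confirmed this fixes the leak):

  /* Sketch: explicitly give the request pad back to streammux,
   * in case the unlink helper does not do it internally. */
  gst_pad_send_event (sinkpad, gst_event_new_flush_stop (FALSE));
  unlink_element_from_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
      appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
  gst_element_release_request_pad (appCtx[0]->pipeline.multi_src_bin.streammux, sinkpad);
  gst_object_unref (sinkpad);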


Hi, could you please provide more details about your situation? I do not know whether we face the same problem or not. There are internal memory leaks in DS 4.0, but mine is not related to those.

You can add my WeChat [zongxp118] for further communication.

Hi, it says the user does not exist when I search for that WeChat ID. Would you mind checking whether anything is wrong? Or describe your procedure on this page so that others can also join the discussion.

Hi, I also tested the official runtime source add/delete example: https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/tree/master/runtime_source_add_delete
I set MAX_NUM_SOURCES = 8. When the program starts with one source, GPU memory usage is 1390 MB, but after deleting all sources and before quitting, GPU memory usage is 1452 MB. So from my point of view, about 62 MB of memory is not released. Is that right? Which part causes the memory leak?
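
For anyone trying to reproduce this, device memory can also be sampled from inside the app around each add/delete call. A minimal sketch, assuming the app is linked against the CUDA runtime (cuda_runtime_api.h):

#include <cuda_runtime_api.h>
#include <glib.h>

/* Sketch: print used/total device memory at a named checkpoint,
 * e.g. right after add_sources() and right after stop_release_source(). */
static void
log_gpu_memory (const gchar * tag)
{
  size_t free_bytes = 0, total_bytes = 0;
  if (cudaMemGetInfo (&free_bytes, &total_bytes) == cudaSuccess) {
    g_print ("[%s] GPU memory used: %zu MiB of %zu MiB\n", tag,
        (total_bytes - free_bytes) / (1024 * 1024), total_bytes / (1024 * 1024));
  }
}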


I did another test, disabling YOLO, the tracker, and the OSD separately, and the GPU memory leak still exists. Then I tried only unlinking the source_bin from streammux and linking it again, repeating this process over and over (sketched below), and I could still observe the memory leak. So my conclusion is that some memory in streammux is not released each time a streammux sink pad is removed, and this small leak, accumulated over repeated add/delete cycles, eventually causes the CUDA failure. Since streammux is not open source I cannot dig deeper to debug. Please help.
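
Roughly, that unlink/relink test looked like this (a simplified sketch using the same helpers as the functions I posted above; state changes and error handling omitted):

/* Sketch: repeatedly detach and re-attach one source bin from its
 * streammux sink pad; GPU memory grows a little on every iteration. */
guint iter;
for (iter = 0; iter < 100; iter++) {
  unlink_element_from_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
      appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
  if (!link_element_to_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
          appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin, source_id)) {
    g_print ("relink failed at iteration %u\n", iter);
    break;
  }
}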


Hi, could you please share the link to the resource in your screenshot? I am encountering the same memory leak problem and am looking for more material about NvStreammux.

I mean this one.


This should be fixed in DS 5.0; please take a try with DS 5.0.

Thanks!
