CUDA failure after repeatedly adding and deleting sources at runtime

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
GPU
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208…    On   | 00000000:01:00.0  On |                  N/A |
| 37%   66C    P2   154W / 250W |   6938MiB / 10997MiB |      94%     Default |
+-------------------------------+----------------------+----------------------+

• DeepStream Version

DeepStream version 4.0.2

• TensorRT Version
TensorRT --version 6.0.1

• NVIDIA GPU Driver Version (valid for GPU only)
440

I added runtime add/delete functionality to deepstream-app. However, after adding and deleting sources multiple times (6 sources, 6 add/delete rounds), the program crashed.
The error information is shown below:

Cuda failure: status=2
Error(-1) in buffer allocation

** (deepstream-app:11288): CRITICAL **: 15:09:24.333: gst_nvds_buffer_pool_alloc_buffer: assertion ‘mem’ failed
ERROR from src_bin_muxer: Failed to allocate the buffers inside the Nvstreammux output pool
Debug info: gstnvstreammux.c(566): gst_nvstreammux_alloc_output_buffers (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstNvStreamMux:src_bin_muxer
Quitting

While the program is running, GPU memory usage also keeps increasing until it is exhausted. It seems the corresponding streammux buffers are not released after a source is deleted. Since streammux is not open source, I can't get more information. I have implemented all the functions in:


which is the logic of your runtime add/delete example on GitHub.

Could you please help me find out where the problem is?

Hi @weiweifu
Which tracker are you using? Could you try running without the tracker in your custom code and see if that works? We found a memory leak in the tracker which is being fixed in the 5.0 GA release.
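
If you are running the standard deepstream-app config, disabling it for this test should just be a matter of (assuming the usual [tracker] group in your config file):

[tracker]
enable=0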

If it still does not work, could you share your changes so we can reproduce and debug?

Thanks!

Hi, have you solved this problem? We are also running into the same issue.

Hi, I use the KLT tracker now. I don't think my problem is related to DeepStream's own tracker leak, since the memory leak still exists in my program when I disable the tracker.

Below are the add function and the delete function. I tried setting breakpoints in the program to find out how much memory each section uses, e.g. how much memory is allocated when I create a new source bin and how much is freed when I delete it, but that was not conclusive. When I call the add function, the memory only changes after set_state to PLAYING and after control returns to the main loop in main(). When I call the delete function, the memory only changes after set_state to NULL. But it is obvious that the memory freed on delete (~90 MB) is less than the memory added (~300 MB). My guess is that the buffer pool in streammux is not freed after the source is deleted, so as the number of add/delete rounds increases, GPU memory is eventually exhausted.

gboolean
stop_release_source (gint source_id)
{
  GstStateChangeReturn state_return;
  gchar pad_name[16];
  GstPad *sinkpad = NULL;

  /* Stop the source bin before detaching it from the pipeline. */
  state_return =
      gst_element_set_state (appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin,
      GST_STATE_NULL);
  guint camera_id = appCtx[0]->config.multi_source_config[source_id].camera_id;
  g_print ("Camera_id of deleted source is: %d, source_id is: %d\n", camera_id, source_id);
  switch (state_return) {
    case GST_STATE_CHANGE_SUCCESS:
      g_print ("STATE CHANGE SUCCESS\n\n");
      /* Flush the matching streammux sink pad, unlink it and drop the bin. */
      g_snprintf (pad_name, 15, "sink_%u", source_id);
      sinkpad = gst_element_get_static_pad (appCtx[0]->pipeline.multi_src_bin.streammux, pad_name);
      gst_pad_send_event (sinkpad, gst_event_new_flush_stop (FALSE));
      unlink_element_from_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
          appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
      g_print ("SINK PAD-STATE CHANGE SUCCESS\n\n");
      gst_object_unref (sinkpad);
      gst_bin_remove (GST_BIN (appCtx[0]->pipeline.multi_src_bin.bin),
          appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
      /* Book-keeping: one source fewer in the app context and config. */
      appCtx[0]->pipeline.multi_src_bin.num_bins--;
      appCtx[0]->config.num_source_sub_bins--;
      memset (&appCtx[0]->config.multi_source_config[source_id], 0,
          sizeof (appCtx[0]->config.multi_source_config[source_id]));
      return TRUE;
      break;
    case GST_STATE_CHANGE_FAILURE:
      g_print ("STATE CHANGE FAILURE\n\n");
      return FALSE;
      break;
    case GST_STATE_CHANGE_ASYNC:
      g_print ("STATE CHANGE ASYNC\n\n");
      return TRUE;
      break;
    case GST_STATE_CHANGE_NO_PREROLL:
      g_print ("STATE CHANGE NO PREROLL\n\n");
      break;
    default:
      break;
  }
  return TRUE;
}

gboolean
add_sources (gchar * new_camera_addr, guint source_id)
{
  NvDsPipeline *pipeline = &appCtx[0]->pipeline;
  GstStateChangeReturn state_return;
  NvDsConfig *config = &appCtx[0]->config;
  guint i;

  /* Fill the source config for the new camera and build its source bin. */
  parse_dir_source (&appCtx[0]->config, new_camera_addr, source_id);
  if (!create_source_bin (&appCtx[0]->config.multi_source_config[source_id],
          &appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id], source_id)) {
    g_print ("Failed to Create the New source_bin");
    return FALSE;
  }
  gst_bin_add (GST_BIN (appCtx[0]->pipeline.multi_src_bin.bin),
      appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
  if (!link_element_to_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
          appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin, source_id)) {
    g_print ("Cannot Link With Streammux");
  }
  appCtx[0]->pipeline.multi_src_bin.num_bins++;

  /* Bring the new source bin up to PLAYING so it joins the running pipeline. */
  state_return =
      gst_element_set_state (appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin,
      GST_STATE_PLAYING);
  switch (state_return) {
    case GST_STATE_CHANGE_SUCCESS:
      g_print ("STATE CHANGE SUCCESS\n\n");
      return TRUE;
      break;
    case GST_STATE_CHANGE_FAILURE:
      g_print ("STATE CHANGE FAILURE\n\n");
      return FALSE;
      break;
    case GST_STATE_CHANGE_ASYNC:
      g_print ("STATE CHANGE ASYNC\n\n");
      /* Block until the asynchronous state change completes. */
      state_return =
          gst_element_get_state (appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin,
          NULL, NULL, GST_CLOCK_TIME_NONE);
      return TRUE;
      break;
    case GST_STATE_CHANGE_NO_PREROLL:
      g_print ("STATE CHANGE NO PREROLL\n\n");
      break;
    default:
      break;
  }
  return TRUE;
}
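
One thing I have not been able to rule out yet: whether unlink_element_from_streammux_sink_pad() also releases the streammux sink pad, which is a request pad. If it does not, the pad might stay alive after the source bin is removed. The extra step would roughly be to give the pad back to streammux before unreffing it in stop_release_source() (sketch only, I have not confirmed this fixes the leak):

  /* Sketch: explicitly give the request pad back to streammux,
   * in case the unlink helper does not do it internally. */
  gst_pad_send_event (sinkpad, gst_event_new_flush_stop (FALSE));
  unlink_element_from_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
      appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
  gst_element_release_request_pad (appCtx[0]->pipeline.multi_src_bin.streammux, sinkpad);
  gst_object_unref (sinkpad);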


Hi, could you please provide more details about your situation? I do not know whether we face the same problem or not. There are internal memory leaks in DS 4.0, but mine is not related to those.

You can add my WeChat [zongxp118] for further communication.

Hi, it says the user does not exist when I search for that WeChat ID. Would you mind checking whether anything is wrong? Or describe your procedure on this page so that others can also join the discussion.

Hi, I also tested the official runtime source add/delete example: https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/tree/master/runtime_source_add_delete
I set MAX_NUM_SOURCES = 8. When the program starts with one source, GPU memory usage is 1390 MB, but after deleting all sources and before quitting, GPU memory usage is 1452 MB. So from my point of view, about 62 MB of memory is not released. Is that right? Which part causes the memory leak?
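
For anyone trying to reproduce this, device memory can also be sampled from inside the app around each add/delete call. A minimal sketch, assuming the app is linked against the CUDA runtime (cuda_runtime_api.h):

#include <cuda_runtime_api.h>
#include <glib.h>

/* Sketch: print used/total device memory at a named checkpoint,
 * e.g. right after add_sources() and right after stop_release_source(). */
static void
log_gpu_memory (const gchar * tag)
{
  size_t free_bytes = 0, total_bytes = 0;
  if (cudaMemGetInfo (&free_bytes, &total_bytes) == cudaSuccess) {
    g_print ("[%s] GPU memory used: %zu MiB of %zu MiB\n", tag,
        (total_bytes - free_bytes) / (1024 * 1024), total_bytes / (1024 * 1024));
  }
}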


I did another test, disabling YOLO, the tracker, and the OSD separately, and the GPU memory leak still exists. Then I tried only unlinking the source_bin from streammux and linking it again, repeating this process over and over (sketched below), and I could still observe the memory leak. So my conclusion is that some memory in streammux is not released each time a streammux sink pad is removed, and this small leak, accumulated over repeated add/delete cycles, eventually causes the CUDA failure. Since streammux is not open source I cannot dig deeper to debug. Please help.
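
Roughly, that unlink/relink test looked like this (a simplified sketch using the same helpers as the functions I posted above; state changes and error handling omitted):

/* Sketch: repeatedly detach and re-attach one source bin from its
 * streammux sink pad; GPU memory grows a little on every iteration. */
guint iter;
for (iter = 0; iter < 100; iter++) {
  unlink_element_from_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
      appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin);
  if (!link_element_to_streammux_sink_pad (appCtx[0]->pipeline.multi_src_bin.streammux,
          appCtx[0]->pipeline.multi_src_bin.sub_bins[source_id].bin, source_id)) {
    g_print ("relink failed at iteration %u\n", iter);
    break;
  }
}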


Hi, could you please share the link to the resource in your screenshot? I am encountering the same memory leak problem and am looking for more material about NvStreammux.

I mean this one.


This should be fixed in DS 5.0; please take a try with DS 5.0.

Thanks!
