Dynamically deleting a stream causes a deadlock

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) A10
• DeepStream Version 6.0
• JetPack Version (valid for Jetson only)
• TensorRT Version 8.01
• NVIDIA GPU Driver Version (valid for GPU only) 470
• Issue Type( questions, new requirements, bugs) bug
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
I added logic to deepstream-test5 to delete a stream at runtime, following deepStream_reference_apps/runtime_source_add_delete.
However, gst_element_set_state (ELEM, GST_STATE_NULL) occasionally deadlocks.
I asked this question before but did not get an answer, so I am asking again.

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fdf37ad90f4 in __GI___pthread_mutex_lock (mutex=0x7fdb340013d0) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007fdf3940e87f in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#3  0x00007fdf3940f335 in gst_pad_set_active () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#4  0x00007fdf393ecf0d in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#5  0x00007fdf393ff884 in gst_iterator_fold () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#6  0x00007fdf393eda16 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#7  0x00007fdf393ef95e in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#8  0x00007fdf393efc06 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#9  0x00007fdf39713365 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstvideo-1.0.so.0
#10 0x00007fddfade1f41 in gst_v4l2_video_dec_change_state () from target:/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libgstnvvideo4linux2.so
#11 0x00007fdf393f1d5e in gst_element_change_state () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#12 0x00007fdf393f2499 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#13 0x00007fdf393cfa02 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#14 0x00007fdf23e50029 in ?? () from target:/usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstplayback.so
#15 0x00007fdf393f1d5e in gst_element_change_state () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#16 0x00007fdf393f2499 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#17 0x00007fdf393cfa02 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#18 0x00007fdf23e6369a in ?? () from target:/usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstplayback.so
#19 0x00007fdf393f1d5e in gst_element_change_state () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#20 0x00007fdf393f2499 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#21 0x00007fdf393cfa02 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#22 0x00007fdf393f1d5e in gst_element_change_state () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#23 0x00007fdf393f2045 in gst_element_change_state () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#24 0x00007fdf393f2499 in ?? () from target:/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#25 0x0000559350e1e5b2 in stop_release_source(NvDsSrcParentBin*, int) ()
#26 0x0000559350e1f295 in event_thread_func_2(void*) ()
#27 0x00007fdf38e75e23 in ?? () from target:/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#28 0x00007fdf38e753a5 in g_main_context_dispatch () from target:/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#29 0x00007fdf38e75770 in ?? () from target:/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#30 0x00007fdf38e75a82 in g_main_loop_run () from target:/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#31 0x0000559350e21e92 in main ()

I got a reply from the GStreamer developers. They say the hang occurs while deactivating the NVIDIA-specific decoder in libgstnvvideo4linux2.so.

We will have our internal team review this and provide an update. Thanks.

Thanks. I can provide the complete conversation with the GStreamer developers:
gst_element_set_state (ELEm, GST_STATE_NULL) causes a deadlock (#1065) · Issues · GStreamer / gstreamer · GitLab

One more note: the destroy_pipeline function can also occasionally trigger a deadlock.

Please provide your complete code, configurations and steps of reproducing the deadlock.

source.zip (522.4 KB)
I uploaded the code with the confidential parts removed.
If compilation fails, you can delete the code that fails to compile.
This code uses the destroy_pipeline and create_pipeline functions to delete streams dynamically.
Command:
./deepstream-test5-app -c configs/test5_config_file_src_infer.txt

Hi @549981178, I cannot run your code in my environment because I do not have your model-engine-file (resnet50_224x224_8_32_16_20210927.trt). Could you provide it? Thanks.
Or could you try to reproduce the deadlock with our demo code and demo model, so that we can analyze and solve the problem more easily?

Uploading: resnet50_224x224_8_32_16_20210927.onnx…
I think deepstream-test5-app is also your demo code, and you can use the demo model,
because the problem is not related to the model.

Hi @549981178, could you reproduce the deadlock with deepstream-test5-app without your own code?
Or just give us a simple diff between your own code and deepstream-test5-app? Thanks.

You just need to add the following code to deepstream_test5_app_main.c
and add g_timeout_add (3000, event_thread_func_2, NULL); to the main function.

static gboolean
add_sources (NvDsSourceConfig * config, NvDsSrcParentBin * bin, gint source_id)
{
    GstStateChangeReturn state_return;

    SPDLOG_LOGGER_INFO(logger, "start source {}", source_id);
    NvDsSrcBin new_bin = {};
    bin->sub_bins[source_id] = new_bin;
    gchar elem_name[50];
    g_snprintf (elem_name, sizeof (elem_name), "src_sub_bin%d", source_id);
    bin->sub_bins[source_id].bin = gst_bin_new (elem_name);
    if (!bin->sub_bins[source_id].bin) {
        NVGSTDS_ERR_MSG_V ("Failed to create '%s'", elem_name);
        return FALSE;
    }

    bin->sub_bins[source_id].bin_id = bin->sub_bins[source_id].source_id = source_id;
    config->live_source = TRUE;
    config->enable = TRUE;
    bin->live_source = TRUE;
    bin->sub_bins[source_id].eos_done = TRUE;
    bin->sub_bins[source_id].reset_done = TRUE;
//    bin->sub_bins[source_id].have_eos = FALSE;
    bin->sub_bins[source_id].parent_bin = bin;

    create_uridecode_src_bin (config, &bin->sub_bins[source_id]);
    gst_bin_add (GST_BIN (bin->bin), bin->sub_bins[source_id].bin);
    link_element_to_streammux_sink_pad (bin->streammux, bin->sub_bins[source_id].bin, source_id);
    state_return =
        gst_element_set_state (bin->sub_bins[source_id].bin, GST_STATE_PLAYING);
    switch (state_return) {
        case GST_STATE_CHANGE_SUCCESS:
            SPDLOG_LOGGER_INFO(logger, "GST_STATE_CHANGE_SUCCESS");
            break;
        case GST_STATE_CHANGE_FAILURE:
            SPDLOG_LOGGER_ERROR(logger, "GST_STATE_CHANGE_FAILURE");
            break;
        case GST_STATE_CHANGE_ASYNC:
            SPDLOG_LOGGER_INFO(logger, "STATE CHANGE ASYNC");
            do{
                state_return = gst_element_get_state (bin->sub_bins[source_id].bin, NULL, NULL,
                                           GST_CLOCK_TIME_NONE);
                sleep(1);
            }while(state_return != GST_STATE_CHANGE_SUCCESS);

            break;
        case GST_STATE_CHANGE_NO_PREROLL:
            SPDLOG_LOGGER_ERROR(logger, "STATE CHANGE NO PREROLL");
            break;
        default:
            break;
    }

    return TRUE;
}

void
stop_release_source (NvDsSrcParentBin *bin, gint source_id)
{
    SPDLOG_LOGGER_INFO(logger, "stop source {}", source_id);
    GstStateChangeReturn state_return;
    GstElement *pipeline = bin->bin;
    gchar pad_name[16];
    GstPad *sinkpad = NULL;
    GstElement *elem = bin->sub_bins[source_id].bin;
    state_return =
        gst_element_set_state (elem, GST_STATE_NULL);
    switch (state_return) {
        case GST_STATE_CHANGE_SUCCESS:
            SPDLOG_LOGGER_INFO(logger, "STATE CHANGE SUCCESS");
            g_snprintf (pad_name, 15, "sink_%u", source_id);
            sinkpad = gst_element_get_static_pad (bin->streammux, pad_name);
            gst_pad_send_event (sinkpad, gst_event_new_flush_stop (FALSE));
            gst_element_release_request_pad (bin->streammux, sinkpad);
            SPDLOG_LOGGER_INFO(logger, "release SUCCESS");
            gst_object_unref (sinkpad);
            gst_bin_remove (GST_BIN (pipeline), elem);
            SPDLOG_LOGGER_INFO(logger, "release SUCCESS2");
//            source_id--;
//            bin->num_bins--;
            bin->sub_bins[source_id].have_eos = TRUE;
            break;
        case GST_STATE_CHANGE_FAILURE:
            SPDLOG_LOGGER_ERROR(logger, "STATE CHANGE FAILURE");
            break;
        case GST_STATE_CHANGE_ASYNC:
            SPDLOG_LOGGER_INFO(logger, "STATE CHANGE ASYNC");
            do{
                state_return = gst_element_get_state (elem, NULL, NULL, GST_CLOCK_TIME_NONE);
                sleep(1);
            }while(state_return != GST_STATE_CHANGE_SUCCESS);
            g_snprintf (pad_name, 15, "sink_%u", source_id);
            sinkpad = gst_element_get_static_pad (bin->streammux, pad_name);
            gst_pad_send_event (sinkpad, gst_event_new_flush_stop (FALSE));
            gst_element_release_request_pad (bin->streammux, sinkpad);
            SPDLOG_LOGGER_INFO(logger, "release SUCCESS");
            gst_object_unref (sinkpad);
            gst_bin_remove (GST_BIN (pipeline), elem);
//            source_id--;
//            bin->num_bins--;
            bin->sub_bins[source_id].have_eos = TRUE;
            break;
        case GST_STATE_CHANGE_NO_PREROLL:
            SPDLOG_LOGGER_INFO(logger, "STATE CHANGE NO PREROLL");
            break;
        default:
            break;
    }
}
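
A side note on the ASYNC branches above: the do/while loops only exit when the state query reports success, so if the query ever reports a failure they would spin forever. A minimal sketch of a bounded alternative follows. It is not GStreamer code: PollResult, PollFn, poll_until_settled and fake_query are hypothetical stand-ins (PollResult only mirrors the idea of GstStateChangeReturn) so that the example compiles without GStreamer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a state-change result; NOT the real GstStateChangeReturn. */
typedef enum {
    POLL_FAILURE,
    POLL_SUCCESS,
    POLL_ASYNC
} PollResult;

/* Hypothetical callback: performs one time-limited state query. */
typedef PollResult (*PollFn)(void *ctx);

/*
 * Call `poll` at most `max_attempts` times.
 * Returns POLL_SUCCESS or POLL_FAILURE as soon as the callback reports one,
 * and POLL_ASYNC if the state is still pending after the last attempt.
 */
static PollResult
poll_until_settled(PollFn poll, void *ctx, int max_attempts)
{
    assert(poll != NULL);
    for (int i = 0; i < max_attempts; i++) {
        PollResult r = poll(ctx);
        if (r != POLL_ASYNC)
            return r;   /* settled: success or failure, stop polling */
    }
    return POLL_ASYNC;  /* gave up: the caller decides how to recover */
}

/* Fake query for demonstration: reports ASYNC `*remaining` times, then SUCCESS. */
static PollResult
fake_query(void *ctx)
{
    int *remaining = (int *) ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return POLL_ASYNC;
    }
    return POLL_SUCCESS;
}

/* Fake query that always reports a failed state change. */
static PollResult
fake_failure(void *ctx)
{
    (void) ctx;
    return POLL_FAILURE;
}
```

In the real application the callback would presumably wrap gst_element_get_state with a finite timeout, and the pad release and gst_bin_remove steps would run only on POLL_SUCCESS. This does not address the deadlock inside the decoder itself; it only keeps the caller from looping indefinitely.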

static gboolean
event_thread_func_2 (gpointer arg)
{
    struct timeval current_time;
    GstStateChangeReturn ret;
    time_t now = time(0);
    tm *ltm = localtime(&now);
    static gboolean relink = FALSE;
    static struct timeval start_time = {};

    SPDLOG_LOGGER_INFO(logger, "Program running fine");
    for (int i = 0; i < num_instances; i++) {
        gboolean need_delete = FALSE;
        GstState state, pending;
        NvDsConfig config = {};
        gettimeofday (&current_time, NULL);
        NvDsSrcParentBin *bin = &appCtx[i]->pipeline.multi_src_bin;
        bool is_config = TRUE;
        if (!parse_source_config_file (&config, cfg_files[i])) {
            is_config = FALSE;
            NVGSTDS_ERR_MSG_V ("Failed to parse config file '%s'", cfg_files[i]);
        }
        ret = gst_element_get_state (bin->streammux, &state, &pending, 0);
        if(state != GST_STATE_PLAYING){
            gettimeofday (&start_time, NULL);
            continue;
        }
        //   Only start the stream-removal logic after 30 seconds of normal running
        if(current_time.tv_sec - start_time.tv_sec < 30){
            continue;
        }
//        g_mutex_lock (&appCtx[i]->app_lock);
        for(int j = 0; j < appCtx[i]->config.num_source_sub_bins; j++)
        {
            // Check whether the config file has changed
            if(is_config && config.multi_source_config[j].enable != appCtx[i]->config.multi_source_config[j].enable)
            {
                if(config.multi_source_config[j].enable)
                {
                    appCtx[i]->config.multi_source_config[j] = config.multi_source_config[j];
                    add_sources(&appCtx[i]->config.multi_source_config[j], bin, j);
                }
                else
                {
                    SPDLOG_LOGGER_WARN(logger, "config file is change, restart");
                    appCtx[i]->config.multi_source_config[j] = config.multi_source_config[j];
                    bin->sub_bins[j].have_eos = TRUE;
                    need_delete = TRUE;
//                    stop_release_source(bin, j);
                }
            }
            //  Delete the stream if it has stalled
            if(!appCtx[i]->config.multi_source_config[j].enable)
                continue;
            gettimeofday (&current_time, NULL);
            gdouble time_diff_msec_since_last_reset =
                1000.0 * (current_time.tv_sec - bin->sub_bins[j].last_buffer_time.tv_sec) +
                    (current_time.tv_usec - bin->sub_bins[j].last_buffer_time.tv_usec) / 1000.0;

            if( bin->sub_bins[j].last_buffer_time.tv_sec != 0 && (time_diff_msec_since_last_reset > 3000) )
            {
                if(!bin->sub_bins[j].have_eos)
                {
                    SPDLOG_LOGGER_WARN(logger, "The stream could not be caught, unlink it, id: {}", bin->sub_bins[j].source_id);
                    appCtx[i]->config.multi_source_config[j].enable = FALSE;
                    bin->sub_bins[j].have_eos = TRUE;
                    need_delete = TRUE;
//                    stop_release_source(bin, j);
                }
            }
            if(bin->sub_bins[j].have_eos && ltm->tm_min == 0 && relink == FALSE)
            {
                // Reconnect on the hour
                relink = TRUE;
                add_sources(&appCtx[i]->config.multi_source_config[j], bin, j);
                NVGSTDS_INFO_MSG_V ("relink stream ,id: %d", bin->sub_bins[j].source_id);
            }
            else if(ltm->tm_min != 0 && relink == TRUE)
            {
                relink = FALSE;
            }
        }
        if(need_delete)
        {
            SPDLOG_LOGGER_WARN(logger, "destroy pipeline");
            if(pause_pipeline (appCtx[i]) == FALSE)
            {
                NVGSTDS_ERR_MSG_V ("Failed to pause pipeline");
            }
            destroy_pipeline(appCtx[i]);
            if (!create_pipeline (appCtx[i], bbox_generated_probe_after_analytics,
                                  NULL, perf_cb, overlay_graphics)) {
                NVGSTDS_ERR_MSG_V ("Failed to create pipeline");
                appCtx[i]->return_value = -1;
                appCtx[i]->quit = TRUE;
            }
            if (gst_element_set_state (appCtx[i]->pipeline.pipeline,
                                       GST_STATE_PLAYING) == GST_STATE_CHANGE_FAILURE) {
                NVGSTDS_ERR_MSG_V ("\ncan't set pipeline to playing state.\n");
                appCtx[i]->return_value = -1;
                appCtx[i]->quit = TRUE;
            }
            gettimeofday (&start_time, NULL);
        }
//        g_mutex_unlock (&appCtx[i]->app_lock);
    }
    return TRUE;  /* keep the g_timeout_add() timer source active */
}
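
For readers following the stall check in event_thread_func_2: the stream is treated as stalled when more than 3000 ms have passed since last_buffer_time, and a zero timestamp means no buffer has been seen yet. The arithmetic can be isolated as below. This is only an illustrative sketch; the helper names elapsed_msec and stream_stalled are hypothetical and not part of DeepStream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h>

/* Milliseconds elapsed from `earlier` to `later`. */
static double
elapsed_msec(const struct timeval *earlier, const struct timeval *later)
{
    assert(earlier != NULL && later != NULL);
    return 1000.0 * (double) (later->tv_sec - earlier->tv_sec)
         + (double) (later->tv_usec - earlier->tv_usec) / 1000.0;
}

/*
 * True when a buffer has been seen before (tv_sec != 0) and the gap since
 * that buffer exceeds `threshold_msec`.
 */
static bool
stream_stalled(const struct timeval *last_buffer,
               const struct timeval *now,
               double threshold_msec)
{
    if (last_buffer->tv_sec == 0)
        return false;   /* no buffer received yet, nothing to compare */
    return elapsed_msec(last_buffer, now) > threshold_msec;
}
```

For example, with a last buffer at 100.9 s and the current time at 104.1 s, elapsed_msec gives 3200 ms, so a 3000 ms threshold marks the stream as stalled.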

If you don’t want to test my code, you can use deepstream-test5-app without it,
because the destroy_pipeline function can also trigger a deadlock:
calling destroy_pipeline in a loop occasionally deadlocks.
Note that I use an RTMP stream; I recommend that you use an RTMP stream as well.

OK, I’ll try to reproduce the issue using deepstream-test5-app with RTMP and no code changes.
Since the deadlock seems to occur with high probability in your environment, could you help verify the following:
1. Go to the lib directory:

cd /opt/nvidia/deepstream/deepstream-6.xx/lib

2. Rename the libnvv4l2.so file:

mv  libnvv4l2.so  libnvv4l2.so.bk

3. Run your demo.

It will then use the avdec_h264 decoder instead of the NVIDIA decoder. Please check whether the issue can still be reproduced. Thanks.

Hi @yuweiw, I think you meant to write mv libnvv4l2.so libnvv4l2.so.bk instead of cp libnvv4l2.so libnvv4l2.so.bk in step 2.

Yes, I’ll change it to mv. Thanks.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks
