Starvation (?) of gstreamer threads

• Hardware Platform: Tesla T4
• DeepStream Version: 5.0
• TensorRT Version: N/A
• NVIDIA GPU Driver Version: 440.95.01
• Issue Type: Question

Hi, I hope someone here is experienced with GStreamer and DeepStream, as I have a really strange issue with my pipeline.

BACKGROUND

The idea is the following: I would like to create a pipeline that takes an RTSP h264/h265 stream as input, decodes it, crops several ROIs out of the stream, filters the crops based on some domain knowledge, and saves JPEGs only for those crops which pass the filtering stage.

The pipeline looks like this (the RTSP source was cut off for schema clarity; there is one additional tee element used when the pipeline is in a different mode, but the graph explains the essential idea):

A probe is attached to the src pad of pre_filter_queue_crop_x, and the queues are initialised with the following properties:

    g_object_set(
        G_OBJECT(crop_context->pre_filter_queue),
        "min-threshold-time", (guint64)(runtime_args->seconds_before_filter_query * GST_SECOND),
        "max-size-time", (guint64)0,  /* guint64 property: cast the literal for varargs */
        "max-size-buffers", 0,
        "max-size-bytes", 0,
        NULL
    );

I have tried removing the min-threshold-time requirement, but the issue still occurs.

The filtering probe is defined as follows:

GstPadProbeReturn filtering_probe(
    GstPad *pad,
    GstPadProbeInfo *info,
    gpointer u_data
){
    GstBuffer *buffer = GST_PAD_PROBE_INFO_BUFFER(info);
    GstClockTime start = gst_clock_get_time(gst_system_clock_obtain());
    CropMetadata *crop_metadata = (CropMetadata*)u_data;
    gboolean success = crop_metadata->till_id == 71;
    if(!success){
        g_print(
            "Dropping buffer cam: %u, till: %u, roi: %s, PTS: %lu.\n",
            crop_metadata->camera_id,
            crop_metadata->till_id,
            crop_metadata->roi_type,
            GST_BUFFER_PTS(buffer)
        );
        return GST_PAD_PROBE_DROP;
    }
    g_print(
        "Match on C side, START %lu, END %lu, DIFF: %lu\n",
        start, gst_clock_get_time(gst_system_clock_obtain()),
        gst_clock_get_time(gst_system_clock_obtain()) - start
    );
    return GST_PAD_PROBE_OK;
}
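For completeness, a probe like this is typically installed on the queue's src pad with gst_pad_add_probe(). A minimal sketch, assuming the crop_context / crop_metadata structures described above:

```c
/* Sketch: attaching filtering_probe to the src pad of a pre-filter queue.
 * "src" is the standard src pad name of a queue element; crop_metadata
 * is assumed to outlive the probe. */
GstPad *src_pad = gst_element_get_static_pad(crop_context->pre_filter_queue, "src");
gst_pad_add_probe(
    src_pad,
    GST_PAD_PROBE_TYPE_BUFFER,  /* fire on buffers only, not events/queries */
    filtering_probe,
    crop_metadata,
    NULL                        /* no GDestroyNotify for the user data */
);
gst_object_unref(src_pad);      /* gst_element_get_static_pad() returns a new ref */
```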

ISSUE DESCRIPTION

When I run the pipeline I can see lots of logs like:

Dropping buffer cam: 27, till: 73, roi: scanning_area, PTS: 261047775.
Dropping buffer cam: 27, till: 72, roi: scanning_area, PTS: 261047775.
Dropping buffer cam: 27, till: 73, roi: scanning_area, PTS: 340978116.
Dropping buffer cam: 27, till: 72, roi: scanning_area, PTS: 340978116.
Dropping buffer cam: 29, till: 77, roi: scanning_area, PTS: 341025061.
Dropping buffer cam: 29, till: 75, roi: scanning_area, PTS: 341025061.
Dropping buffer cam: 29, till: 76, roi: scanning_area, PTS: 341025061.
Dropping buffer cam: 27, till: 73, roi: scanning_area, PTS: 420842949.
Dropping buffer cam: 27, till: 72, roi: scanning_area, PTS: 420842949.
Dropping buffer cam: 29, till: 77, roi: scanning_area, PTS: 420781728.
Dropping buffer cam: 29, till: 75, roi: scanning_area, PTS: 420781728.
Dropping buffer cam: 29, till: 76, roi: scanning_area, PTS: 420781728.
Dropping buffer cam: 27, till: 73, roi: scanning_area, PTS: 500623889.

basically from all tills apart from 71, which should be preserved. But I don't see logs like:

Match on C side, START 1604911038974641642, END 1604911038974642149, DIFF: 377

while the pipeline is up and running, and no JPEG image is saved. What I can see is the growing size of the pre_filter_queue_crop_x associated with till 71. Another probe is attached to its src pad:

GstPadProbeReturn mark_frame_enqueued(
    GstPad *pad,
    GstPadProbeInfo *info,
    gpointer u_data
){
    GstElement *queue = (GstElement *)gst_pad_get_parent(pad);

    guint current_buffers, max_buffers, min_buffers, current_bytes, max_bytes;
    guint64 current_times, max_times;
    g_object_get(G_OBJECT(queue),
        "current-level-buffers", &current_buffers,
        "current-level-bytes", &current_bytes,
        "current-level-time", &current_times,
        "max-size-time", &max_times,
        "max-size-buffers", &max_buffers,
        "max-size-bytes", &max_bytes,
        "min-threshold-buffers", &min_buffers,
        NULL
    );
    g_print("[%p] mark_frame_enqueued. Current buffers: %u/%u/%u. Current bytes: %u/%u Current time: %lu/%lu.\n",
        queue,
        min_buffers, current_buffers, max_buffers,
        current_bytes, max_bytes,
        current_times, max_times
    );
    gst_object_unref(queue);  /* gst_pad_get_parent() returns a new reference */
    return GST_PAD_PROBE_OK;
}

The filtered logs look like this:

[...]
[0x164e630] mark_frame_enqueued. Current buffers: 0/62/0. Current bytes: 7929552/0 Current time: 5174248535/0.
[0x164e630] mark_frame_enqueued. Current buffers: 0/63/0. Current bytes: 8057448/0 Current time: 5253252980/0.
[0x164e630] mark_frame_enqueued. Current buffers: 0/64/0. Current bytes: 8185344/0 Current time: 5333257390/0.
[0x164e630] mark_frame_enqueued. Current buffers: 0/65/0. Current bytes: 8313240/0 Current time: 5413261764/0.
[0x164e630] mark_frame_enqueued. Current buffers: 0/66/0. Current bytes: 8313240/0 Current time: 5413261764/0.
[...]

and the value of current_buffers is growing.

The moment I finally see data saved on disk and logs of this type:

Match on C side, START 1604911038974641642, END 1604911038974642149, DIFF: 377

is when I press CTRL+C and send EOS to the pipeline. Then crops start to be saved and I see a bunch of logs in the console.

CONSIDERATIONS

I’ve executed a bunch of tests to investigate the issue:

  • When I always return GST_PAD_PROBE_DROP or GST_PAD_PROBE_OK from filtering_probe(), everything works as it should; in the latter case crops are saved on-line and the queue size stays almost constant.
  • When I randomly return GST_PAD_PROBE_DROP / GST_PAD_PROBE_OK for all queries, it also works, but there is a noticeable delay (when I did it in a controllable manner, I confirmed that branches of the pipeline are initially stuck until all of them receive a GST_PAD_PROBE_OK response from filtering_probe()).
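The random variant from the second test can be sketched like this (a sketch based on the description above; only the return value differs from the real filtering logic, and g_random_boolean() comes from GLib):

```c
/* Sketch: a probe that randomly drops or passes buffers, used to test
 * the scheduling behaviour of the per-branch queue threads. */
GstPadProbeReturn random_filtering_probe(
    GstPad *pad,
    GstPadProbeInfo *info,
    gpointer u_data
){
    return g_random_boolean() ? GST_PAD_PROBE_OK : GST_PAD_PROBE_DROP;
}
```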

MY ASSUMPTIONS

  • As I have read that the queue element spawns an execution thread, I assume that for some reason the threads that receive GST_PAD_PROBE_OK are not scheduled for execution (confirmed by dumping thread IDs during execution). I don’t know why; maybe my understanding of GStreamer is not good enough.

WHAT I AM ASKING FOR

I would really appreciate any suggestions from people who have experience with GStreamer / DeepStream. Maybe I am just making some stupid mistake I cannot spot. Thanks in advance to anyone who can help :)

Hi,
The pipeline is

.. ! rtph265depay ! tee ! nvv4l2decoder ! tee name=t1 t1. ! queue ! nvvideoconvert ! capsfilter ! queue ! jpegenc ! filesink t1. ! queue ! nvvideoconvert ! capsfilter ! queue ! jpegenc ! filesink t1. ! queue ! nvvideoconvert ! capsfilter ! queue ! jpegenc ! filesink 

You may break down the pipeline to get more clues. You can also try with nveglglessink, like:

.. ! rtph265depay ! tee ! nvv4l2decoder ! nveglglessink

Check if the h265 stream is decoded and shown on screen.

Unfortunately this is not an option for me - I am working on a remote server with SSH access only.

Besides, when I run my pipeline with only one “branch”, everything works fine. What is more, the multi-branch case works fine when you “flush the pipe” by returning GST_PAD_PROBE_OK from filtering_probe() the first time the probe fires for a particular pre_filter_queue_crop_x, regardless of your filtering logic.

Hi,
If it is not possible to check video playback, please try dumping YUV frames:

.. ! rtph265depay ! tee ! nvv4l2decoder ! nvvideoconvert ! video/x-raw ! multifilesink location=/tmp/dump_%05d.yuv

and check if the YUV frames are correct.

If there is no filtering probe, data flows through the pipeline correctly. I will try to prepare a minimal representative code example (I believe it will not be possible to reproduce the issue with gst-launch).

Hi, I’ve managed to create a code sample that reproduces the issue using filesrc, so that everyone can run the code. I’ve prepared a Dockerfile and Makefile to make it easier to try on your side. Please find the repository attached: GstreamerIssueExample.zip (25.5 KB)

GST issue - minimal example

Build Docker image


repo_root$ docker build -t gstreamer_issue .

Run container


docker run --gpus all -it gstreamer_issue

Run example inside container


(container) /code$ ./fetch_data.sh

(container) /code$ ./app [mode]

Example:
(container) /code$ ./app 0

mode has two valid options:

  • 0 - unfair crop filtering - this is the essence of the issue - in this mode all crops from branch 1 are dropped and all crops from branch 2 are saved. We can observe that branch 2 gets stuck.

  • 1 - random filtering - we can observe that one branch is stuck until both branches receive GST_PAD_PROBE_OK.

The pipeline that is created looks like this:

So:

  • using mode=0 we can run the problematic case
  • using mode=1 we can run the case that mostly works.

I ran the code on my end and here are the logs:

What we can see in the logs:

Mode 0

MODE: 0
Ready state...
Running...
[Branch 1] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
UNFAIR FILTERING, branch 1, crop drop.
[Branch 2] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
UNFAIR FILTERING, branch 2, crop pass.
// From this moment we can only see:
[...]
UNFAIR FILTERING, branch 1, crop drop.
[Branch 2] monitoring_probe. Current buffers: 0/9/0. Current bytes: 307800/0 Current time: 375375000/0.
[Branch 1] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
UNFAIR FILTERING, branch 1, crop drop.
[Branch 2] monitoring_probe. Current buffers: 0/10/0. Current bytes: 342000/0 Current time: 417083333/0.
[Branch 1] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
[...]
UNFAIR FILTERING, branch 1, crop drop.
[Branch 2] monitoring_probe. Current buffers: 0/2043/0. Current bytes: 69870600/0 Current time: 85210125000/0.
[Branch 1] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
UNFAIR FILTERING, branch 1, crop drop.
[Branch 2] monitoring_probe. Current buffers: 0/2044/0. Current bytes: 69904800/0 Current time: 85251833333/0.
[Branch 1] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
UNFAIR FILTERING, branch 1, crop drop.
[Branch 2] monitoring_probe. Current buffers: 0/2045/0. Current bytes: 69939000/0 Current time: 85293541666/0.
[Branch 1] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
UNFAIR FILTERING, branch 1, crop drop.
[Branch 2] monitoring_probe. Current buffers: 0/2046/0. Current bytes: 69973200/0 Current time: 85335250000/0.
[Branch 1] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
UNFAIR FILTERING, branch 1, crop drop.
[Branch 2] monitoring_probe. Current buffers: 0/2047/0. Current bytes: 70007400/0 Current time: 85376958333/0.
[Branch 1] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
UNFAIR FILTERING, branch 1, crop drop.
[Branch 2] monitoring_probe. Current buffers: 0/2048/0. Current bytes: 70041600/0 Current time: 85418666666/0.
[Branch 1] monitoring_probe. Current buffers: 0/0/0. Current bytes: 0/0 Current time: 0/0.
UNFAIR FILTERING, branch 1, crop drop.
// Branch 2 buffer is linearly growing....
[...]
// And at the very end:
[Branch 2] monitoring_probe. Current buffers: 0/2134/0. Current bytes: 72982800/0 Current time: 89005583333/0.
[Branch 1] Crop saved: data/branch_2_0.jpeg
UNFAIR FILTERING, branch 2, crop pass.
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_1.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_3.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_4.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_5.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_6.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_7.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_8.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_9.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[...]
// up to the end:
[Branch 1] Crop saved: data/branch_2_2121.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2122.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2123.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2124.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2125.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2126.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2127.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2128.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2129.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2130.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2131.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2132.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2133.jpeg
UNFAIR FILTERING, branch 2, crop pass.
[Branch 1] Crop saved: data/branch_2_2134.jpeg
[Branch 1] Crop saved: data/branch_2_2135.jpeg

Please ignore the wrong branch id in the “Crop saved” message ([Branch 1] Crop saved: data/branch_2_2135.jpeg) - I hard-coded 1 when creating the log (already corrected in the code I’ve provided); the branch id encoded in the crop path is correct!

Mode 1

Here we can observe a (not strictly, but quite) uniform distribution of filtering probe activity :)

Hope that now everything is clear :)

Hi,
Will you have nvstreammux and nvinfer plugins to run deep learning inference? For using the DeepStream SDK, we would suggest doing deep learning inference; somehow you don’t have this part in the pipeline.

Thanks for the reply. I must say that I do not use this pipeline for any deep learning inference. I rather use the pipeline for video processing.

Hi,
So the issue is not seen if you always return GST_PAD_PROBE_OK in filtering_probe()? Not sure if it is related, but we have seen an issue about adding probe functions:
Doesn't work nvv4l2decoder in deep-stream-python-apps

Another suspicion is the synchronization mechanism in GStreamer. If you return GST_PAD_PROBE_DROP irregularly, the downstream elements receive buffers with random-interval timestamps, which may trigger the issue. Please set sync=false on the two multifilesink elements and try again.
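Setting sync=false in code would look roughly like this (a sketch; the multifilesink variable name is an assumption):

```c
/* Disable clock synchronization on the sink so buffers are written as
 * soon as they arrive, regardless of their timestamps. Equivalent to
 * "... ! multifilesink ... sync=false" in a gst-launch line. */
g_object_set(G_OBJECT(multifilesink), "sync", FALSE, NULL);
```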

Hi,
Please refer to the attached sample and check if you can change to these pipelines. It sends decoded frames to two appsink elements, and you can decide to encode or drop the frames in the callback.

gst_jpg_on_demand_1.zip (1.7 KB)
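The attached sample is not reproduced here, but the appsink approach might be sketched as follows (assumptions: emit-signals is enabled on the appsink, and keep_frame() stands in for the filtering logic):

```c
/* Sketch: pull decoded frames from an appsink and decide per frame
 * whether to keep (encode/save) or drop them. */
static GstFlowReturn on_new_sample(GstElement *appsink, gpointer u_data){
    GstSample *sample = NULL;
    /* "pull-sample" is the appsink action signal returning the next sample */
    g_signal_emit_by_name(appsink, "pull-sample", &sample);
    if(sample == NULL)
        return GST_FLOW_ERROR;
    GstBuffer *buffer = gst_sample_get_buffer(sample);
    if(keep_frame(buffer, u_data)){   /* hypothetical filtering predicate */
        /* ... hand the buffer to a JPEG encoder / writer here ... */
    }
    gst_sample_unref(sample);
    return GST_FLOW_OK;
}

/* registration, done once while building the pipeline: */
g_object_set(appsink, "emit-signals", TRUE, NULL);
g_signal_connect(appsink, "new-sample", G_CALLBACK(on_new_sample), user_data);
```

With this layout the filtering decision happens after the sink, so no branch blocks upstream while waiting for other branches.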