GStreamer pipeline hangs during termination

Hi,

I’m trying to run this inference pipeline on an Orin Nano device using Python (the appsink callback is simplified from our real use case to reproduce the issue). The process hangs during termination, and the pipeline cannot exit gracefully.

The Python script:

import os
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

os.environ["USE_NEW_NVSTREAMMUX"] = "yes"

RTSP_URL = "rtsp://<username>:<password>@<ip>:<port>"
ENGINE_PATH = "<yolo_engine_file_path>"
CONFIG_PATH = "<detector_config_file_path>"
TRACKER_CONFIG_PATH = "<tracker_config_file_path>"  # config content same as https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvtracker.html#trafficcamnet-nvsort

Gst.init(None)

pipeline_description = """
rtspsrc drop-on-latency=True latency=3000 protocols=tcp timeout=0 tcp-timeout=0 teardown-timeout=0 !
rtph264depay ! h264parse ! tee ! queue ! decodebin ! tee ! m.sink_0 nvstreammux name=m batch-size=1 sync-inputs=True !
queue ! nvvideoconvert ! video/x-raw(memory:NVMM),format=(string)RGBA ! queue !
nvinfer batch-size=1 !
nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so gpu-id=0 display-tracking-id=1 !
queue ! tee ! appsink emit-signals=True
"""

pipeline = Gst.parse_launch(pipeline_description)

def bus_call(bus, message, loop):
    t = message.type
    if t == Gst.MessageType.EOS:
        print("End-of-stream")
        pipeline.set_state(Gst.State.NULL)
        loop.quit()
    elif t == Gst.MessageType.ERROR:
        err, debug = message.parse_error()
        print(f"Error: {err}, {debug}")
        pipeline.set_state(Gst.State.NULL)
        loop.quit()
    return True

def callback(sink):
    sample = sink.emit("pull-sample")
    return Gst.FlowReturn.OK

rtspsrc = pipeline.get_by_name("rtspsrc0")
rtspsrc.set_property("location", RTSP_URL)

nvinfer = pipeline.get_by_name("nvinfer0")
nvinfer.set_property("model-engine-file", ENGINE_PATH)
nvinfer.set_property("config-file-path", CONFIG_PATH)

nvtracker = pipeline.get_by_name("nvtracker0")
nvtracker.set_property("ll-config-file", TRACKER_CONFIG_PATH)

sink = pipeline.get_by_name("appsink0")
sink.connect("new-sample", callback)

loop = GLib.MainLoop()

bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect("message", bus_call, loop)

pipeline.set_state(Gst.State.PLAYING)

try:
    loop.run()
except:
    pass

pipeline.set_state(Gst.State.NULL)

After running the script for a short while and then pressing Ctrl+C, the gst log shows that it tries to change the element states to terminate the pipeline, and then:

0:00:22.452147830 11861 0xaaab12b97d80 INFO              GST_STATES gstbin.c:2928:gst_bin_change_state_func:<manager> child 'rtpsession0' changed state to 3(PAUSED) successfully
0:00:22.452163062 11861 0xaaab12b97d80 INFO              GST_STATES gstelement.c:2806:gst_element_continue_state:<manager> completed state change to PAUSED
0:00:22.452176246 11861 0xaaab12b97d80 INFO              GST_STATES gstelement.c:2706:_priv_gst_element_state_changed:<manager> notifying about state-changed PLAYING to PAUSED (VOID_PENDING pending)
0:00:22.453182739 11861 0xaaab134baf00 INFO                    task gsttask.c:368:gst_task_func:<queue2:src> Task going to paused
0:00:22.461647272 11861 0xaaab134baf60 INFO                    task gsttask.c:368:gst_task_func:<queue1:src> Task going to paused
0:00:22.468888409 11861 0xffff0c006920 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:00:22.475457079 11861 0xffff0c006920 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]

After around twenty nvstreammux “push failed” messages, it gets stuck at:

0:00:22.590386482 11861 0xffff0c006920 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:00:22.594176991 11861 0xffff0c006920 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:01:47.288580738 11861 0xaaab12b97d80 WARN                 rtspsrc gstrtspsrc.c:5734:gst_rtspsrc_loop_interleaved:<rtspsrc0> warning: The server closed the connection.
0:01:47.288759751 11861 0xaaab12b97d80 INFO        GST_ERROR_SYSTEM gstelement.c:2271:gst_element_message_full_with_details:<rtspsrc0> posting message: Could not read from resource.
0:01:47.288845897 11861 0xaaab12b97d80 INFO        GST_ERROR_SYSTEM gstelement.c:2298:gst_element_message_full_with_details:<rtspsrc0> posted warning message: Could not read from resource.
0:01:47.289068944 11861 0xaaab12b97d80 INFO                    task gsttask.c:368:gst_task_func:<task0> Task going to paused

and stays hanging without termination.
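The stall is easier to spot when the timestamps are compared mechanically. Below is a minimal sketch, assuming the default GStreamer debug log layout in which each line starts with an `H:MM:SS.nnnnnnnnn` timestamp; `parse_ts` and `largest_gap` are hypothetical helper names, and the two sample lines are copied from the log above.

```python
import re

# Leading timestamp of a GStreamer debug line, e.g. "0:01:47.288580738 ".
_TS = re.compile(r"^(\d+):(\d{2}):(\d{2})\.(\d{9})\s")


def parse_ts(line):
    """Return the leading timestamp in nanoseconds, or None if the line has none."""
    m = _TS.match(line)
    if m is None:
        return None
    hours, minutes, seconds, nanos = (int(g) for g in m.groups())
    return (hours * 3600 + minutes * 60 + seconds) * 10**9 + nanos


def largest_gap(lines):
    """Return (gap_ns, line_before, line_after) for the longest silence in the log."""
    best = (0, None, None)
    prev = None  # (timestamp_ns, line) of the previous timestamped line
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # skip lines without a timestamp, e.g. plain print() output
        if prev is not None and ts - prev[0] > best[0]:
            best = (ts - prev[0], prev[1], line)
        prev = (ts, line)
    return best


SAMPLE = [
    "0:00:22.594176991 11861 0xffff0c006920 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]",
    "0:01:47.288580738 11861 0xaaab12b97d80 WARN                 rtspsrc gstrtspsrc.c:5734:gst_rtspsrc_loop_interleaved:<rtspsrc0> warning: The server closed the connection.",
]

gap_ns, before, after = largest_gap(SAMPLE)
print(f"longest silence: {gap_ns / 1e9:.1f} s")
```

For the two lines above, the silence is roughly 85 seconds, between the last “push failed” message and the rtspsrc warning. On a full log file, `largest_gap(open(path))` would work the same way, with `path` being wherever the log was written.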

The testing environment is as follows:

  • Hardware Platform: Jetson Orin Nano 4G
  • DeepStream Version: 7.0
  • JetPack: 6.0
  • L4T: 36.3.0
  • TensorRT: 8.6.2.3
  • CUDA: 12.2.140
  • VPI: 3.1.5
  • Gstreamer: 1.20.3

Some experiments and tests we have done:

  1. Replacing appsink with fakesink in the Python script lets it terminate gracefully. The same pipeline run through gst-launch-1.0 with fakesink also terminates normally:
USE_NEW_NVSTREAMMUX=yes gst-launch-1.0 -e -v rtspsrc location=<rtsp_url> drop-on-latency=True latency=3000 protocols=tcp timeout=0 tcp-timeout=0 teardown-timeout=0 ! rtph264depay ! h264parse ! tee ! queue ! decodebin ! tee ! m.sink_0 nvstreammux name=m batch-size=1 sync-inputs=True ! queue ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=(string)RGBA' ! queue ! nvinfer batch-size=1 model-engine-file=<engine_path> config-file-path=<config_path> ! nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=<tracker_config> gpu-id=0 display-tracking-id=1 ! queue ! tee ! fakesink sync=True async=False enable-last-sample=False

However, there seems to be no easy way to test appsink directly from the gst-launch-1.0 command.

  2. Removing the nvtracker element also lets the Python script terminate normally, which leads us to suspect that the cause lies somewhere in nvtracker or appsink in the new release.

  3. The same Python script runs and terminates without hanging in an older test environment we used previously:

  • Hardware Platform: Jetson Orin Nano 4G
  • DeepStream Version: 6.2
  • JetPack: 5.1.1
  • L4T: 35.3.1
  • TensorRT: 8.5.2.2
  • CUDA: 11.4.315
  • VPI: 2.2.7
  • Gstreamer: 1.16.3
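
The sink and tracker experiments above can be repeated from a single script by parameterising the pipeline description. This is only a sketch: the element strings are copied from the script and the gst-launch-1.0 command in this post, while `PIPELINE_TEMPLATE`, `TRACKER`, `SINKS` and `build_description` are hypothetical names introduced for illustration.

```python
# Element chain from the script above, with the tracker and the sink left as slots.
PIPELINE_TEMPLATE = (
    "rtspsrc drop-on-latency=True latency=3000 protocols=tcp timeout=0 "
    "tcp-timeout=0 teardown-timeout=0 ! "
    "rtph264depay ! h264parse ! tee ! queue ! decodebin ! tee ! "
    "m.sink_0 nvstreammux name=m batch-size=1 sync-inputs=True ! "
    "queue ! nvvideoconvert ! video/x-raw(memory:NVMM),format=(string)RGBA ! queue ! "
    "nvinfer batch-size=1 ! "
    "{tracker}"
    "queue ! tee ! {sink}"
)

TRACKER = (
    "nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/"
    "libnvds_nvmultiobjecttracker.so gpu-id=0 display-tracking-id=1 ! "
)

SINKS = {
    "appsink": "appsink emit-signals=True",
    "fakesink": "fakesink sync=True async=False enable-last-sample=False",
}


def build_description(sink="appsink", with_tracker=True):
    """Return a parse_launch description for one sink/tracker combination."""
    return PIPELINE_TEMPLATE.format(
        tracker=TRACKER if with_tracker else "",
        sink=SINKS[sink],
    )


# Print all four combinations (appsink/fakesink, with/without nvtracker).
for sink_name in SINKS:
    for with_tracker in (True, False):
        print(sink_name, with_tracker, build_description(sink_name, with_tracker))
```

Each string can be passed to `Gst.parse_launch` in place of `pipeline_description`. Note that `pipeline.get_by_name("appsink0")` and `pipeline.get_by_name("nvtracker0")` return `None` in the variants that omit those elements, so the property and signal setup would need a guard.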

Please kindly let me know what other information I can provide, and what else I can do to fix the issue. Thank you for your support!

Hi,
It looks like you don’t send EOS on termination. Please add it and see if it helps.

You may refer to these samples:
Synchronizing audio and video after inference of RTSP streams - #5 by DaneLLL
Nvv4l2decoder sometimes fails to negotiate with downstream after several pipeline re-launches - #16 by DaneLLL

The pipeline still hangs after sending an EOS event.

  1. If I keep the loop.run(), time.sleep doesn’t work, so I used GLib.timeout_add_seconds to send the EOS event. The modified lower part of the script now looks like:
# Create a GLib MainLoop
loop = GLib.MainLoop()
 
# Add a bus watch to the pipeline
bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect("message", bus_call, loop)
 
# Start playing the pipeline
pipeline.set_state(Gst.State.PLAYING)
 
def send_eos():
    print("Sending EOS event")
    Gst.Element.send_event(pipeline, Gst.Event.new_eos())
 
GLib.timeout_add_seconds(20, send_eos)
 
try:
    # Run the loop
    loop.run()
except:
    pass
 
# Clean up
pipeline.set_state(Gst.State.NULL)

After the pipeline runs for 20 seconds, the timeout gets triggered and the terminal output shows:

Sending EOS event
End-of-stream
[NvMultiObjectTracker] De-initialized

The gst log shows:

...
0:00:32.793477260 12538 0xaaaaeadc6860 INFO              GST_STATES gstbin.c:2928:gst_bin_change_state_func:<manager> child 'rtpsession0' changed state to 3(PAUSED) successfully
0:00:32.793519181 12538 0xaaaaeadc6860 INFO              GST_STATES gstelement.c:2806:gst_element_continue_state:<manager> completed state change to PAUSED
0:00:32.793537870 12538 0xaaaaeadc6860 INFO              GST_STATES gstelement.c:2706:_priv_gst_element_state_changed:<manager> notifying about state-changed PLAYING to PAUSED (VOID_PENDING pending)
0:01:57.542820457 12538 0xaaaaeadc6860 WARN                 rtspsrc gstrtspsrc.c:5734:gst_rtspsrc_loop_interleaved:<rtspsrc0> warning: The server closed the connection.
0:01:57.543032751 12538 0xaaaaeadc6860 INFO        GST_ERROR_SYSTEM gstelement.c:2271:gst_element_message_full_with_details:<rtspsrc0> posting message: Could not read from resource.
0:01:57.543123538 12538 0xaaaaeadc6860 INFO        GST_ERROR_SYSTEM gstelement.c:2298:gst_element_message_full_with_details:<rtspsrc0> posted warning message: Could not read from resource.
0:01:57.543186227 12538 0xaaaaeadc6860 INFO                    task gsttask.c:368:gst_task_func:<task0> Task going to paused

and then the program hangs as before and does not terminate.

The full gst log is attached:
gst-simplescript-element-sendeos-JP6.log (216.6 KB)

  2. I also tried without the loop (which is less preferable in our practical use), but the hang remains. The lower part of the Python script was modified to this:
# Start playing the pipeline
pipeline.set_state(Gst.State.PLAYING)

time.sleep(10)  # requires `import time` at the top of the script
print("Sending EOS event")
Gst.Element.send_event(pipeline, Gst.Event.new_eos())

# Clean up
pipeline.set_state(Gst.State.NULL)

After running for 10 seconds the terminal output prints out the “Sending EOS event” message, but then it still hangs during termination, and the gst log shows:

0:00:29.247558218 12604 0xaaaab9a50460 INFO              GST_STATES gstbin.c:2928:gst_bin_change_state_func:<manager> child 'rtpsession0' changed state to 3(PAUSED) successfully
0:00:29.247574795 12604 0xaaaab9a50460 INFO              GST_STATES gstelement.c:2806:gst_element_continue_state:<manager> completed state change to PAUSED
0:00:29.247586667 12604 0xaaaab9a50460 INFO              GST_STATES gstelement.c:2706:_priv_gst_element_state_changed:<manager> notifying about state-changed PLAYING to PAUSED (VOID_PENDING pending)
0:00:29.252025354 12604 0xffff04007000 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:00:29.254209672 12604 0xaaaab9a50460 INFO                    task gsttask.c:368:gst_task_func:<task0> Task going to paused
0:00:29.263882300 12604 0xffff04007000 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:00:29.273124419 12604 0xffff04007000 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]

followed by more nvstreammux “push failed” messages.
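
One thing worth noting about the loop-free variant: it sets the pipeline to NULL immediately after sending EOS, so the EOS has no time to reach the sink. A minimal sketch of waiting for the EOS message on the bus first is shown below. It is not expected to resolve the hang by itself, since the loop-based variant already receives “End-of-stream” and still hangs. `stop_after_eos` is a hypothetical helper; the `gst` argument is the `Gst` module from `gi.repository`, passed in explicitly so the helper can be exercised without a GStreamer install.

```python
def stop_after_eos(pipeline, gst, eos_timeout_s=5):
    """Send EOS, wait for EOS or ERROR on the bus, then set the pipeline to NULL.

    `gst` is expected to be the Gst module (from gi.repository import Gst).
    Returns True if an EOS or ERROR message arrived before the timeout.
    """
    pipeline.send_event(gst.Event.new_eos())

    # Block until the EOS has travelled through the pipeline (or an error is
    # posted), but never longer than eos_timeout_s seconds.
    bus = pipeline.get_bus()
    msg = bus.timed_pop_filtered(
        int(eos_timeout_s * gst.SECOND),
        gst.MessageType.EOS | gst.MessageType.ERROR,
    )

    # Tear down regardless of whether EOS arrived in time.
    pipeline.set_state(gst.State.NULL)
    return msg is not None
```

In the loop-free script this would replace the last two statements, called as `stop_after_eos(pipeline, Gst)`. It is meant for the case where no GLib main loop is running, because a bus signal watch driven by a running loop may consume the EOS message before `timed_pop_filtered` sees it.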

Please advise on what else to try. Thank you!

Hi,
Please share a full sample app and the steps, so that we can set up an Orin Nano developer kit to reproduce the issue and check.

I sent the scripts through direct message, please let me know if they’re received and the issue could be reproduced. Thank you!