GStreamer pipeline hangs during termination

Hi,

I’m trying to run this inference pipeline on an Orin Nano device using Python (the appsink contents are simplified from our real use case for reproducing the issue). The process hangs during termination and the pipeline cannot exit gracefully.

The Python script:

import os
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

os.environ["USE_NEW_NVSTREAMMUX"] = "yes"

RTSP_URL = "rtsp://<username>:<password>@<ip>:<port>"
ENGINE_PATH = "<yolo_engine_file_path>"
CONFIG_PATH = "<detector_config_file_path>"
TRACKER_CONFIG_PATH = "<tracker_config_file_path>"  # config content same as https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvtracker.html#trafficcamnet-nvsort

Gst.init(None)

pipeline_description = """
rtspsrc drop-on-latency=True latency=3000 protocols=tcp timeout=0 tcp-timeout=0 teardown-timeout=0 !
rtph264depay ! h264parse ! tee ! queue ! decodebin ! tee ! m.sink_0 nvstreammux name=m batch-size=1 sync-inputs=True !
queue ! nvvideoconvert ! video/x-raw(memory:NVMM),format=(string)RGBA ! queue !
nvinfer batch-size=1 !
nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so gpu-id=0 display-tracking-id=1 !
queue ! tee ! appsink emit-signals=True
"""

pipeline = Gst.parse_launch(pipeline_description)

def bus_call(bus, message, loop):
    t = message.type
    if t == Gst.MessageType.EOS:
        print("End-of-stream")
        pipeline.set_state(Gst.State.NULL)
        loop.quit()
    elif t == Gst.MessageType.ERROR:
        err, debug = message.parse_error()
        print(f"Error: {err}, {debug}")
        pipeline.set_state(Gst.State.NULL)
        loop.quit()
    return True

def callback(sink):
    # Pull the sample to keep data flowing; the real processing has been
    # stripped out for this minimal reproduction.
    sample = sink.emit("pull-sample")
    return Gst.FlowReturn.OK

rtspsrc = pipeline.get_by_name("rtspsrc0")
rtspsrc.set_property("location", RTSP_URL)

nvinfer = pipeline.get_by_name("nvinfer0")
nvinfer.set_property("model-engine-file", ENGINE_PATH)
nvinfer.set_property("config-file-path", CONFIG_PATH)

nvtracker = pipeline.get_by_name("nvtracker0")
nvtracker.set_property("ll-config-file", TRACKER_CONFIG_PATH)

sink = pipeline.get_by_name("appsink0")
sink.connect("new-sample", callback)

loop = GLib.MainLoop()

bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect("message", bus_call, loop)

pipeline.set_state(Gst.State.PLAYING)

try:
    loop.run()
except KeyboardInterrupt:
    pass

pipeline.set_state(Gst.State.NULL)
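
One detail worth noting in the script above: Ctrl+C ends loop.run() through KeyboardInterrupt, so set_state(Gst.State.NULL) executes while buffers may still be in flight. A common alternative is to intercept SIGINT and request a clean EOS first (with PyGObject, GLib.unix_signal_add is the usual way to hook SIGINT into the main loop). Below is a stdlib-only sketch of that control flow; the ShutdownRequest class and the simulated signal are illustrative, not part of the original script:

```python
import signal

class ShutdownRequest:
    """Records that SIGINT arrived. A real handler would instead ask the
    pipeline to drain, e.g. pipeline.send_event(Gst.Event.new_eos()), and
    let the bus watch quit the main loop once EOS comes back."""
    def __init__(self):
        self.requested = False

    def __call__(self, signum, frame):
        self.requested = True

handler = ShutdownRequest()
signal.signal(signal.SIGINT, handler)

# Simulate the user pressing Ctrl+C: the handler runs instead of a
# KeyboardInterrupt propagating out of loop.run().
signal.raise_signal(signal.SIGINT)
```

This keeps teardown inside the main loop, so the pipeline can drain before the state change to NULL.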

After running the script for a short while and then pressing Ctrl+C, the GStreamer log shows it trying to change the element states to terminate the pipeline, and then:

0:00:22.452147830 11861 0xaaab12b97d80 INFO              GST_STATES gstbin.c:2928:gst_bin_change_state_func:<manager> child 'rtpsession0' changed state to 3(PAUSED) successfully
0:00:22.452163062 11861 0xaaab12b97d80 INFO              GST_STATES gstelement.c:2806:gst_element_continue_state:<manager> completed state change to PAUSED
0:00:22.452176246 11861 0xaaab12b97d80 INFO              GST_STATES gstelement.c:2706:_priv_gst_element_state_changed:<manager> notifying about state-changed PLAYING to PAUSED (VOID_PENDING pending)
0:00:22.453182739 11861 0xaaab134baf00 INFO                    task gsttask.c:368:gst_task_func:<queue2:src> Task going to paused
0:00:22.461647272 11861 0xaaab134baf60 INFO                    task gsttask.c:368:gst_task_func:<queue1:src> Task going to paused
0:00:22.468888409 11861 0xffff0c006920 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:00:22.475457079 11861 0xffff0c006920 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]

After around twenty “push failed” messages from nvstreammux, it gets stuck at:

0:00:22.590386482 11861 0xffff0c006920 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:00:22.594176991 11861 0xffff0c006920 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:01:47.288580738 11861 0xaaab12b97d80 WARN                 rtspsrc gstrtspsrc.c:5734:gst_rtspsrc_loop_interleaved:<rtspsrc0> warning: The server closed the connection.
0:01:47.288759751 11861 0xaaab12b97d80 INFO        GST_ERROR_SYSTEM gstelement.c:2271:gst_element_message_full_with_details:<rtspsrc0> posting message: Could not read from resource.
0:01:47.288845897 11861 0xaaab12b97d80 INFO        GST_ERROR_SYSTEM gstelement.c:2298:gst_element_message_full_with_details:<rtspsrc0> posted warning message: Could not read from resource.
0:01:47.289068944 11861 0xaaab12b97d80 INFO                    task gsttask.c:368:gst_task_func:<task0> Task going to paused

and stays hanging without termination.
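
When a shutdown hangs like this, it can help to see which Python-level call each thread is blocked in. The stdlib exposes every thread's current frame via sys._current_frames (note that GStreamer streaming threads are native, so a full picture needs gdb or py-spy; this only shows the Python side). A hypothetical helper, not part of the original script:

```python
import sys
import traceback

def dump_stacks() -> str:
    """Return the current Python stack of every running thread."""
    out = []
    for thread_id, frame in sys._current_frames().items():
        out.append(f"--- Thread {thread_id} ---\n")
        out.extend(traceback.format_stack(frame))
    return "".join(out)

# Calling this from a signal handler while the process is stuck shows where
# each thread is waiting; faulthandler.register(signal.SIGUSR1) offers a
# similar on-demand dump without custom code.
```

Triggering the dump while the process hangs after Ctrl+C would show whether the main thread is stuck inside set_state.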

The testing environment is as follows:

  • Hardware Platform: Jetson Orin Nano 4G
  • DeepStream Version: 7.0
  • JetPack: 6.0
  • L4T: 36.3.0
  • TensorRT: 8.6.2.3
  • CUDA: 12.2.140
  • VPI: 3.1.5
  • Gstreamer: 1.20.3

Some experiments and tests we have done:

  1. Replacing appsink with fakesink in the Python script lets it terminate gracefully. The gst-launch-1.0 command using fakesink with the same pipeline can also terminate normally:
USE_NEW_NVSTREAMMUX=yes gst-launch-1.0 -e -v rtspsrc location=<rtsp_url> drop-on-latency=True latency=3000 protocols=tcp timeout=0 tcp-timeout=0 teardown-timeout=0 ! rtph264depay ! h264parse ! tee ! queue ! decodebin ! tee ! m.sink_0 nvstreammux name=m batch-size=1 sync-inputs=True ! queue ! nvvideoconvert ! 'video/x-raw(memory:NVMM),format=(string)RGBA' ! queue ! nvinfer batch-size=1 model-engine-file=<engine_path> config-file-path=<config_path> ! nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so ll-config-file=<tracker_config> gpu-id=0 display-tracking-id=1 ! queue ! tee ! fakesink sync=True async=False enable-last-sample=False

but there seems to be no easy way to test appsink directly from a gst-launch-1.0 command.

  2. Removing the nvtracker element also lets the Python script terminate normally, which leads us to suspect that the cause lies somewhere in the newer release of nvtracker or appsink.

  3. The same Python script can run and terminate without hanging in an older testing environment we had previously:

  • Hardware Platform: Jetson Orin Nano 4G
  • DeepStream Version: 6.2
  • JetPack: 5.1.1
  • L4T: 35.3.1
  • TensorRT: 8.5.2.2
  • CUDA: 11.4.315
  • VPI: 2.2.7
  • Gstreamer: 1.16.3

Please kindly let me know what other information I can provide, and what else I can do to fix the issue. Thank you for your support!

Hi,
It looks like you don’t send EOS during termination. Please add it and see if it works.

May refer to the samples:
Synchronizing audio and video after inference of RTSP streams - #5 by DaneLLL
Nvv4l2decoder sometimes fails to negotiate with downstream after several pipeline re-launches - #16 by DaneLLL
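
For reference, the teardown sequence those samples use is roughly: send EOS into the pipeline, wait (with a bound) for the EOS message to come back on the bus, and only then set the pipeline to NULL. With real GStreamer that is pipeline.send_event(Gst.Event.new_eos()) followed by bus.timed_pop_filtered(...). Below is a stdlib-only sketch of that control flow; FakePipeline and FakeBus are stand-ins so the logic is visible without a device:

```python
import queue

class FakeBus:
    """Stand-in for Gst.Bus: holds posted messages, supports a bounded pop."""
    def __init__(self):
        self._messages = queue.Queue()

    def post(self, msg):
        self._messages.put(msg)

    def timed_pop_filtered(self, timeout_s, wanted):
        # Simplified mimic of Gst.Bus.timed_pop_filtered: block up to
        # timeout_s for a matching message, return None on timeout.
        try:
            while True:
                msg = self._messages.get(timeout=timeout_s)
                if msg in wanted:
                    return msg
        except queue.Empty:
            return None

class FakePipeline:
    """Stand-in for Gst.Pipeline: a healthy pipeline echoes EOS on its bus."""
    def __init__(self):
        self.bus = FakeBus()
        self.state = "PLAYING"

    def send_event(self, event):
        if event == "eos":
            self.bus.post("eos")

    def set_state(self, state):
        self.state = state

def shutdown(pipeline, timeout_s=5.0):
    """Send EOS, wait (bounded) for it on the bus, then drop to NULL.

    Returns the bus message, or None if EOS never came back -- the hang
    case, where forcing NULL anyway is the only remaining option.
    """
    pipeline.send_event("eos")
    msg = pipeline.bus.timed_pop_filtered(timeout_s, {"eos", "error"})
    pipeline.set_state("NULL")
    return msg
```

With real GStreamer the equivalent wait is bus.timed_pop_filtered(5 * Gst.SECOND, Gst.MessageType.EOS | Gst.MessageType.ERROR); the bound ensures a misbehaving element cannot block teardown forever.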

After sending EOS events, the pipeline still hangs.

  1. If I keep loop.run(), it doesn’t work with time.sleep, so I used GLib.timeout_add_seconds to send the EOS event. The modified lower part of the script now looks like:
# Create a GLib MainLoop
loop = GLib.MainLoop()
 
# Add a bus watch to the pipeline
bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect("message", bus_call, loop)
 
# Start playing the pipeline
pipeline.set_state(Gst.State.PLAYING)
 
def send_eos():
    print("Sending EOS event")
    Gst.Element.send_event(pipeline, Gst.Event.new_eos())
    return False  # one-shot: do not reschedule the timeout
 
GLib.timeout_add_seconds(20, send_eos)
 
try:
    # Run the loop
    loop.run()
except:
    pass
 
# Clean up
pipeline.set_state(Gst.State.NULL)

After the pipeline runs for 20 seconds, the timeout gets triggered and the terminal output shows:

Sending EOS event
End-of-stream
[NvMultiObjectTracker] De-initialized

The gst log shows:

...
0:00:32.793477260 12538 0xaaaaeadc6860 INFO              GST_STATES gstbin.c:2928:gst_bin_change_state_func:<manager> child 'rtpsession0' changed state to 3(PAUSED) successfully
0:00:32.793519181 12538 0xaaaaeadc6860 INFO              GST_STATES gstelement.c:2806:gst_element_continue_state:<manager> completed state change to PAUSED
0:00:32.793537870 12538 0xaaaaeadc6860 INFO              GST_STATES gstelement.c:2706:_priv_gst_element_state_changed:<manager> notifying about state-changed PLAYING to PAUSED (VOID_PENDING pending)
0:01:57.542820457 12538 0xaaaaeadc6860 WARN                 rtspsrc gstrtspsrc.c:5734:gst_rtspsrc_loop_interleaved:<rtspsrc0> warning: The server closed the connection.
0:01:57.543032751 12538 0xaaaaeadc6860 INFO        GST_ERROR_SYSTEM gstelement.c:2271:gst_element_message_full_with_details:<rtspsrc0> posting message: Could not read from resource.
0:01:57.543123538 12538 0xaaaaeadc6860 INFO        GST_ERROR_SYSTEM gstelement.c:2298:gst_element_message_full_with_details:<rtspsrc0> posted warning message: Could not read from resource.
0:01:57.543186227 12538 0xaaaaeadc6860 INFO                    task gsttask.c:368:gst_task_func:<task0> Task going to paused

and then the program hangs as before and does not terminate.

The full gst log is attached:
gst-simplescript-element-sendeos-JP6.log (216.6 KB)

  2. I also tried without the loop (which is less preferable in our practical use), but the hanging behavior remains. The lower part of the Python script is modified to this:
# Start playing the pipeline
pipeline.set_state(Gst.State.PLAYING)

time.sleep(10)  # note: requires "import time" at the top of the script
print("Sending EOS event")
Gst.Element.send_event(pipeline, Gst.Event.new_eos())

# Clean up
pipeline.set_state(Gst.State.NULL)

After running for 10 seconds, the terminal prints the “Sending EOS event” message, but the process still hangs during termination, and the gst log shows:

0:00:29.247558218 12604 0xaaaab9a50460 INFO              GST_STATES gstbin.c:2928:gst_bin_change_state_func:<manager> child 'rtpsession0' changed state to 3(PAUSED) successfully
0:00:29.247574795 12604 0xaaaab9a50460 INFO              GST_STATES gstelement.c:2806:gst_element_continue_state:<manager> completed state change to PAUSED
0:00:29.247586667 12604 0xaaaab9a50460 INFO              GST_STATES gstelement.c:2706:_priv_gst_element_state_changed:<manager> notifying about state-changed PLAYING to PAUSED (VOID_PENDING pending)
0:00:29.252025354 12604 0xffff04007000 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:00:29.254209672 12604 0xaaaab9a50460 INFO                    task gsttask.c:368:gst_task_func:<task0> Task going to paused
0:00:29.263882300 12604 0xffff04007000 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]
0:00:29.273124419 12604 0xffff04007000 ERROR                default gstnvstreammux_pads.cpp:342:push:<m> push failed [-2]

along with more nvstreammux push-failure messages.

Please advise what are some other things to try, thank you!

Hi,
Please share a full sample app and the steps, so that we can set up an Orin Nano developer kit to reproduce and check the issue.

I sent the scripts through direct message; please let me know if they were received and whether the issue can be reproduced. Thank you!

Hi,

We would like to test your app on the newer software release.

Could you share the plugin library and engine built on JetPack 6.1?
Or the plugin source code and the ONNX model, so we can generate them for different software versions?

Thanks.

The plugin source code and the ONNX model were sent in a DM. Thank you.

Hi,

We tested your script on Orin Nano / JetPack 6.1 / DeepStream 7.1.
The app terminates immediately when I type Ctrl+C.

$ time python3 gst_script.py 
max_fps_dur 8.33333e+06 min_fps_dur 2e+08
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
Setting min object dimensions as 16x16 instead of 1x1 to support VIC compute mode.
0:00:00.312497564 24793 0xaaab12e72700 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 1]: deserialized trt engine from :./yolo_nano.engine
Implicit layer support has been deprecated
INFO: [Implicit Engine Info]: layers num: 0

0:00:00.312606942 24793 0xaaab12e72700 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 1]: Use deserialized engine model: ./yolo_nano.engine
0:00:00.319709343 24793 0xaaab12e72700 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<nvinfer0> [UID 1]: Load new model:./config_car_detector.config sucessfully
Opening in BLOCKING MODE 
NvMMLiteOpen : Block : BlockType = 261 
NvMMLiteBlockCreate : Block : BlockType = 261 
max_fps_dur 8.33333e+06 min_fps_dur 2e+08
^C
real    0m8.982s
user    0m1.358s
sys    0m0.746s

Thanks.

Hi,

Please also check the private message.
We have shared the change in plugin source for JetPack 6.1.

Thanks.

Hi,

Could you share a sample video with us?

If the issue comes from the tracker side, this might be a data-dependent issue.
A video with several target objects detected should help us clarify this.

Thanks.

The sample video is sent in DM.

Thank you for the patch for JetPack 6.1 provided in the DM; we will start testing it once we can upgrade the system from 6.0 to 6.1. Currently, when following the instructions in How to Install and Configure JetPack SDK — JetPack 6.2 documentation and Software Packages and the Update Mechanism — NVIDIA Jetson Linux Developer Guide documentation, our device loses its internet connection and we cannot move on with testing.

I wonder, is it possible to provide a solution for JetPack 6.0 or JetPack 6.2? We may consider upgrading directly to 6.2 if the extra fixes are included in that release.

Thank you very much for your support.

Hi,

Thanks for sharing the video.
The app also terminates immediately with the video you provided.

$ time python3 gst_script.py 
max_fps_dur 8.33333e+06 min_fps_dur 2e+08
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
Setting min object dimensions as 16x16 instead of 1x1 to support VIC compute mode.
0:00:00.298672678 27186 0xaaaadb1e4100 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 1]: deserialized trt engine from :./yolo_nano.engine
Implicit layer support has been deprecated
INFO: [Implicit Engine Info]: layers num: 0

0:00:00.298772841 27186 0xaaaadb1e4100 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 1]: Use deserialized engine model: ./yolo_nano.engine
0:00:00.305821541 27186 0xaaaadb1e4100 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<nvinfer0> [UID 1]: Load new model:./config_car_detector.config sucessfully
Opening in BLOCKING MODE 
NvMMLiteOpen : Block : BlockType = 261 
NvMMLiteBlockCreate : Block : BlockType = 261 
max_fps_dur 8.33333e+06 min_fps_dur 2e+08
^C
real	0m5.103s
user	0m0.611s
sys	0m0.356s

Thanks.

Hi,

We just confirmed that the app can terminate successfully with JetPack 6.2+Orin Nano.

$ time python3 gst_script.py 

(gst-plugin-scanner:6624): GStreamer-WARNING **: 06:29:55.507: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_udp.so': librivermax.so.0: cannot open shared object file: No such file or directory

(gst-plugin-scanner:6624): GStreamer-WARNING **: 06:29:55.521: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_deepstream_bins.so': libgstrtspserver-1.0.so.0: cannot open shared object file: No such file or directory

(gst-plugin-scanner:6624): GStreamer-WARNING **: 06:29:55.534: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_inferserver.so': libtritonserver.so: cannot open shared object file: No such file or directory
(Argus) Error EndOfFile: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 277)
(Argus) Error EndOfFile: Receive worker failure, notifying 1 waiting threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 350)
(Argus) Error InvalidState: Argus client is exiting with 1 outstanding client threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 366)
(Argus) Error EndOfFile: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 379)
(Argus) Error EndOfFile: Client thread received an error from socket (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 145)
(Argus) Error EndOfFile:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 92)
GLib (gthread-posix.c): Unexpected error from C library during 'pthread_setspecific': Invalid argument.  Aborting.
max_fps_dur 8.33333e+06 min_fps_dur 2e+08
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
Setting min object dimensions as 16x16 instead of 1x1 to support VIC compute mode.
WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
0:00:02.877897549  6623 0xaaaaea317b00 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2092> [UID = 1]: deserialized trt engine from :./yolo_nano.engine
Implicit layer support has been deprecated
INFO: [Implicit Engine Info]: layers num: 0

0:00:02.877986576  6623 0xaaaaea317b00 INFO                 nvinfer gstnvinfer.cpp:684:gst_nvinfer_logger:<nvinfer0> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2195> [UID = 1]: Use deserialized engine model: ./yolo_nano.engine
0:00:02.889870627  6623 0xaaaaea317b00 INFO                 nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<nvinfer0> [UID 1]: Load new model:./config_car_detector.config sucessfully
Opening in BLOCKING MODE 
NvMMLiteOpen : Block : BlockType = 261 
NvMMLiteBlockCreate : Block : BlockType = 261 
max_fps_dur 8.33333e+06 min_fps_dur 2e+08
^C
real	0m7.904s
user	0m1.156s
sys	0m0.918s

Thanks.

Thank you for the confirmation. Were any of the above experiments done successfully on JetPack 6.0?

Hi,
Since this is fixed in a later JetPack release, we would suggest upgrading your system.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.