NVIDIA gstreamer element segfaults with `Gst.debug_add_log_function` race condition

The following two pipelines achieve the same result (but the second uses NVIDIA GStreamer elements):

NO_NV="""
    filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4
    ! qtdemux
    ! h264parse
    ! avdec_h264
    ! xvimagesink
"""

NV="""
    filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4
    ! qtdemux
    ! h264parse
    ! nvv4l2decoder
    ! nvvideoconvert
    ! xvimagesink
"""
  • I’m running gst-launch from Python’s subprocess, and both pipelines work fine.
  • They also work fine when launched from Python using Gst.parse_launch.

I’m trying to attach a custom logger using Gst.debug_add_log_function (docs here), but the NV pipeline segfaults when I do. With GST_DEBUG=2:

Opening in BLOCKING MODE 
0:00:00.093462508 27987      0x554b750 WARN                    v4l2 gstv4l2object.c:4430:gst_v4l2_object_probe_caps:<nvv4l2decoder0:src> Failed to probe pixel aspect ratio with VIDIOC_CROPCAP: Unknown error -1
0:00:00.093555696 27987      0x554b750 WARN                    v4l2 gstv4l2object.c:2372:gst_v4l2_object_add_interlace_mode:0x54e1090 Failed to determine interlace mode
0:00:00.094160457 27987      0x554b750 WARN                 basesrc gstbasesrc.c:3583:gst_base_src_start_complete:<filesrc0> pad not activated yet
0:00:00.094888999 27987      0x55500f0 WARN                 qtdemux qtdemux_types.c:233:qtdemux_type_get: unknown QuickTime node type sgpd
0:00:00.094950666 27987      0x55500f0 WARN                 qtdemux qtdemux_types.c:233:qtdemux_type_get: unknown QuickTime node type sbgp
0:00:00.095009644 27987      0x55500f0 WARN                 qtdemux qtdemux.c:3031:qtdemux_parse_trex:<qtdemux0> failed to find fragment defaults for stream 1
0:00:00.095199828 27987      0x55500f0 WARN                 qtdemux qtdemux.c:3031:qtdemux_parse_trex:<qtdemux0> failed to find fragment defaults for stream 2
NvMMLiteOpen : Block : BlockType = 261 
NVMEDIA: Reading vendor.tegra.display-size : status: 6 
NvMMLiteBlockCreate : Block : BlockType = 261 


{'user_data': (), 'message': <Gst.DebugMessage object at 0x7f950a9e58 (void at 0x7f89d17828)>, 'source': <Gst.Pad object at 0x7f950ac0d8 (GstPad at 0x545ad30)>, 'dline': 4430, 'dfctn': 'gst_v4l2_object_probe_caps', 'dfile': '/dvs/git/dirty/git-master_linux/3rdparty/gst/gst-v4l2/gst-v4l2/gstv4l2object.c', 'level': <enum GST_LEVEL_WARNING of type Gst.DebugLevel>, 'category': <Gst.DebugCategory object at 0x7f9509fc78 (void at 0x522fba0)>}

0:00:00.206069582 27987      0x55500f0 WARN                    v4l2 gstv4l2object.c:4430:gst_v4l2_object_probe_caps:<nvv4l2decoder0:src> Failed to probe pixel aspect ratio with VIDIOC_CROPCAP: Unknown error -1
sys:1: Warning: g_object_is_floating: assertion 'G_IS_OBJECT (object)' failed
sys:1: Warning: g_object_get_qdata: assertion 'G_IS_OBJECT (object)' failed
Segmentation fault (core dumped)
  • I need to monitor these debug logs to parse some messages which are not posted to the bus (messages on the bus would be parseable inside bus_call, see code below).
  • The crash only happens when using the NV pipe, which uses NVIDIA’s GStreamer elements.
  • Note that the callback gets called at least once before the crash.
  • GST_DEBUG<=1 obviously avoids the segfault, but it also silences my custom logger.
  • The behavior of the log function does not matter (a no-op or a lambda also produces the segfault).
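The debug lines above follow GStreamer’s standard text log layout (timestamp, PID, thread pointer, level, category, file:line:function:<object> message). As a fallback for the messages I need, the same fields can be recovered from gst-launch’s stderr with a regex; this is only a sketch of that idea (the regex and function name are mine, not part of any GStreamer API):

```python
import re

# One GST_DEBUG line looks like:
#   0:00:00.093462508 27987  0x554b750 WARN  v4l2 gstv4l2object.c:4430:func:<obj> message
GST_LOG_RE = re.compile(
    r"^(?P<ts>\d+:\d{2}:\d{2}\.\d+)\s+"     # running timestamp
    r"(?P<pid>\d+)\s+"                       # process id
    r"(?P<thread>0x[0-9a-fA-F]+)\s+"         # thread pointer
    r"(?P<level>[A-Z]+)\s+"                  # debug level (WARN, ERROR, ...)
    r"(?P<category>\S+)\s+"                  # debug category (v4l2, qtdemux, ...)
    r"(?P<file>[\w.-]+):(?P<line>\d+):(?P<func>\w+):"  # source location
    r"(?:<(?P<obj>[^>]+)>\s*)?"              # optional <object> name
    r"(?P<msg>.*)$"                          # the actual message
)

def parse_gst_log_line(line):
    """Return a dict of log fields, or None if the line is not a debug line."""
    m = GST_LOG_RE.match(line)
    return m.groupdict() if m else None
```

This only sees what GStreamer prints to stderr, not the structured objects the callback receives, but it avoids touching the in-process log hooks entirely.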

To avoid it I can

  1. Sleep > 200 ms after the pipeline state is set to Gst.State.PLAYING. The limit seems to be around 108 ms on a system without external load: running the same command over and over while sleeping exactly this amount of time, it will sometimes crash. Sleeping longer than this “guarantees” the crash won’t happen.
  2. Add the log function from a callback, e.g. using GObject.timeout_add_seconds to run it after 1 sec (running it with a timeout of 0 will sometimes also produce the crash). This skips the first message, which would be received if the log function was added in the main thread before setting the pipeline’s state to Gst.State.PLAYING.
  3. In both cases the log function does get called as expected for subsequent messages (though I lose whatever would be logged before the delay).

Questions

  1. Is this expected behavior? Why does this happen only when using NVIDIA’s elements?
  2. Is the current workaround robust (Option 2)? Are there any alternatives to this “solution”?

Code

#!/usr/bin/env python3

import shlex
import subprocess
import sys

import gi
from gi.repository import GLib
from gi.repository import GObject
gi.require_version("Gst", "1.0")
from gi.repository import Gst

NO_NV="""
    filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4
    ! qtdemux
    ! h264parse
    ! avdec_h264
    ! xvimagesink
"""

NV="""
    filesrc location=/opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.mp4
    ! qtdemux
    ! h264parse
    ! nvv4l2decoder
    ! nvvideoconvert
    ! xvimagesink
"""

def on_message(category, level, dfile, dfctn, dline, source, message, *user_data):
    print(f"\n\n{locals()}\n")
    return True

def bus_call(bus, message, loop):
    t = message.type
    if t == Gst.MessageType.EOS:
        sys.stdout.write("End-of-stream\n")
        loop.quit()
    elif t == Gst.MessageType.ERROR:
        err, debug = message.parse_error()
        sys.stderr.write("Error: %s: %s\n" % (err, debug))
        loop.quit()
    return True

def add_log():
    Gst.debug_add_log_function(on_message)
    return True

def python(pipe, mode):

    Gst.init(None)
    GObject.threads_init()
    mainloop = GObject.MainLoop()

    pipeline = Gst.parse_launch(pipe)
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect("message", bus_call, mainloop)
    pipeline.set_state(Gst.State.PLAYING)
    if mode == "main":
        add_log()
    elif mode == "sleep":
        from time import sleep
        sleep(0.2)
        add_log()
    elif mode == "callback":
        GObject.timeout_add_seconds(1, add_log)
    else:
        print("Not adding custom log function")

    mainloop.run()

def gstlaunch(pipe):
    return subprocess.check_call(shlex.split(f"gst-launch-1.0 {pipe}"))


if __name__ == "__main__":
    # gstlaunch(NO_NV)  # This runs ok
    # gstlaunch(NV)  # This runs ok
    # python(NO_NV, "nolog")  # This runs ok
    # python(NV, "nolog")  # This runs ok
    # python(NO_NV, "main")  # This runs ok
    python(NV, "main")  # This crashes
    # python(NV, "sleep")  # This runs OK
    # python(NV, "callback")  # This runs OK

HW

• Hardware Platform (Jetson / GPU) Jetson Xavier
• DeepStream Version 5
• JetPack Version (valid for Jetson only) 4.4
• TensorRT Version 7.1

I’ve tried your Python script on a T4 and a Jetson NX board; the script works well.

Please provide complete information as applicable to your setup.

• Hardware Platform
• DeepStream Version
• JetPack Version
• TensorRT Version
• Issue Type (bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

BTW, it is better to add “pipeline.set_state(Gst.State.NULL)” after “mainloop.run()”.

Tried the same script, now using another Xavier (specs below), and it works OK. I’m going to check back on Monday with the original device.

  • NVIDIA Jetson AGX Xavier [16GB]
    • Jetpack 4.4 [L4T 32.4.3]
    • CUDA: 10.2.89
    • cuDNN: 8.0.0.180
    • TensorRT: 7.1.3.0

Tried with the new Jetson and the code checks out (the only difference is a full system reboot).

If it appears again, I’ll comment here…