Custom Deepstream App Segmentation Fault with nvarguscamerasrc Camera Disconnect

• Hardware Platform (Jetson / GPU)
Jetson Xavier NX
• DeepStream Version
6.0
• JetPack Version (valid for Jetson only)
4.6
• TensorRT Version
8.0.1.6
• Issue Type( questions, new requirements, bugs)
Questions

I have a custom DeepStream application written in Python that I am trying to get working. The application sets up two separate pipelines, one showing a constant still image and one using nvarguscamerasrc and DeepStream plugins, and switches between the two at the press of a button. I want the application to switch automatically to the still-image pipeline when the camera is disconnected, and then be able to reinitialize or rebuild the camera pipeline once the camera is reconnected. Switching to the still-image pipeline works, but any attempt to set up the camera pipeline again results in a segmentation fault. Below is my full application, and below that is the part of the code that I’m trying to get to work:

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GObject, GLib
import threading
import time
import os


class CameraApp:
    def __init__(self):
        self.active_pipeline = None
        self.inactive_pipeline = None

        self.loop = None
        self.bus = None # Active bus

        self.glitch = False
        self.CamRunning = False

    def _bus_call(self, bus, message):
        t = message.type
        if t == Gst.MessageType.EOS:
            print("End-of-stream\n")
            self.stop()
        elif t == Gst.MessageType.WARNING:
            err, debug = message.parse_warning()
            print("Warning: %s: %s\n" % (err, debug))
        elif t == Gst.MessageType.ERROR:
            err, debug = message.parse_error()
            print("Error: %s: %s\n" % (err, debug))

            if self.CamRunning: # TODO - Add camera error check specifically
                print('Switching to splash screen pipeline')
                self._switch_pipeline()

                # Up to here it works without segfaulting - if below is added it seg faults either immediately, or upon switching back to the camera pipeline

                if self.inactive_pipeline:
                    print('Setting camera pipeline to null')
                    self.inactive_pipeline.set_state(Gst.State.NULL)
                os.system("/etc/init.d/nvargus-daemon stop") # TODO - do we have to stop and restart? Maybe changing the pipeline is enough
                time.sleep(1)
                os.system("/etc/init.d/nvargus-daemon start")
                time.sleep(1)
                print('Creating new camera pipeline')
                self.inactive_pipeline = self._create_camera_pipeline()
            else:
                self.stop()
        return True


    def _switch_pipeline(self):
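        # NOTE: threading.Lock() creates a brand-new lock on every call, so this block is not serialized against other callers
        with threading.Lock():
            print('Switched pipelines.')
            self.active_pipeline.set_state(Gst.State.PAUSED)
            self.active_pipeline, self.inactive_pipeline = self.inactive_pipeline, self.active_pipeline # Switch pipelines
            self.bus = self.active_pipeline.get_bus() # Update active bus
            # NOTE: the previous bus keeps its watch, and each re-activation connects another handler to the same bus
            self.bus.add_signal_watch()
            self.bus.connect("message", self._bus_call)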
            self.active_pipeline.set_state(Gst.State.PLAYING) # Could potentially go above bus calls to get pipeline playing faster
            self.CamRunning = not self.CamRunning # Switch state (assumes splash screen pipeline is the first pipeline started)


    def _handle_keyboard_input(self, source, condition):
        # Check if the condition is triggered by keyboard input
        if condition & GLib.IO_IN:
            # Get the key pressed
            key = source.readline().strip()
            print(f"Key pressed: {key}")

            # Press s to switch between pipelines
            if key == 's':
                self._switch_pipeline()


        return True


    def _create_splash_pipeline(self):
        pipeline_string = "filesrc location=/usr/bin/Splash_Screen.png ! decodebin ! videoconvert ! imagefreeze ! nvvidconv ! nvdrmvideosink conn-id=0 sync=false"
        return Gst.parse_launch(pipeline_string)

    def _create_camera_pipeline(self):
        pipeline_string = """
            nvarguscamerasrc ee-mode=1 ee-strength=.1 tnr-mode=1 tnr-strength=0.1 wbmode=fluorescent ispdigitalgainrange='1 1' bufapi-version=1 
                ! video/x-raw(memory:NVMM),width=1920,height=1080,format=NV12 ! tee name=t1
            t1. ! queue ! nvvideoconvert ! video/x-raw,width=240,height=136,format=GRAY8 ! exposure
            t1. ! queue ! nvvideoconvert ! video/x-raw,width=1920,height=1080,format=RGBA ! zoom ! nvvidconv ! nvdrmvideosink conn-id=0 sync=false
            t1. ! queue ! identity drop-probability=0 ! nvvideoconvert ! mux.sink_0 nvstreammux name=mux batch_size=1 width=1920 height=1080 ! nvvideoconvert ! tee name=t2
                t2. ! video/x-raw(memory:NVMM),width=1280,height=720 ! nvinfer config-file-path=/home/root/Software/DeepstreamWork/vi_deepstream/nvinfer_configs/vi_jetson_pgie_invivo_config.txt ! viexample ! fakesink
        """
        return Gst.parse_launch(pipeline_string)



    def start(self):
        '''Start the Gstreamer application'''

        # Initialize Gstreamer and other pieces
        Gst.init(None)

        # Set the loop
        self.loop = GLib.MainLoop()

        # Add keyboard input watch
        keyboard_fd = 0  # Use 0 for stdin
        self.keyboard_handler_id = GLib.io_add_watch(GLib.IOChannel(keyboard_fd), GLib.IO_IN, self._handle_keyboard_input)

        # Create separate camera and splash screen pipelines - set splash screen pipeline as the active one
        self.active_pipeline = self._create_splash_pipeline()
        if not self.active_pipeline:
            print("Unable to create splash pipeline\n")
            return

        self.inactive_pipeline = self._create_camera_pipeline()
        if not self.inactive_pipeline:
            print("Unable to create camera pipeline\n")
            return

        # Add message bus to keep an eye out for incoming messages
        self.bus = self.active_pipeline.get_bus()
        self.bus.add_signal_watch()
        self.bus.connect("message", self._bus_call)

        # Start playing the active pipeline
        self.inactive_pipeline.set_state(Gst.State.NULL)
        self.active_pipeline.set_state(Gst.State.PLAYING)

        try:
            self.loop.run()
        except BaseException:
            pass

    def stop(self):
        print("Stopping camera application.")
        os.system("/etc/init.d/nvargus-daemon stop")
        if self.active_pipeline:
            self.active_pipeline.set_state(Gst.State.NULL)
        if self.inactive_pipeline:
            self.inactive_pipeline.set_state(Gst.State.NULL)
        if self.loop:
            self.loop.quit()
        # os.system("/etc/init.d/nvargus-daemon stop") # TODO - before or after setting other stuff to NULL?


def main():

    # Initialize system
    os.system("stty -F /dev/ttyTHS0 115200")
    os.system("echo 1 > /sys/class/graphics/fb1/blank")
    os.system("echo 1 > /sys/class/graphics/fb0/blank")
    os.system("nvpmodel -m 2")
    time.sleep(2)
    os.system("/etc/init.d/nvargus-daemon stop") # Restart the camera daemon in case it's running
    time.sleep(1)
    os.system("/etc/init.d/nvargus-daemon start")
    time.sleep(1)

    # Initialize the Gstreamer camera application
    app = CameraApp()

    # Start the pipeline
    app.start()


if __name__ == "__main__":
    main()

Here is the specific code block that I’m trying to get to work:

        elif t == Gst.MessageType.ERROR:
            err, debug = message.parse_error()
            print("Error: %s: %s\n" % (err, debug))

            if self.CamRunning: # TODO - Add camera error check specifically
                print('Switching to splash screen pipeline')
                self._switch_pipeline()

                # Up to here it works without segfaulting - if below is added it seg faults either immediately, or upon switching back to the camera pipeline

                if self.inactive_pipeline:
                    print('Setting camera pipeline to null')
                    self.inactive_pipeline.set_state(Gst.State.NULL)
                os.system("/etc/init.d/nvargus-daemon stop") # TODO - do we have to stop and restart? Maybe changing the pipeline is enough
                time.sleep(1)
                os.system("/etc/init.d/nvargus-daemon start")
                time.sleep(1)
                print('Creating new camera pipeline')
                self.inactive_pipeline = self._create_camera_pipeline()
            else:
                self.stop()

Basically, the idea is that when the camera disconnects, an error is posted to the bus, at which point the application switches to the still-image pipeline. From there I try to set the camera pipeline to NULL and reinitialize it, stop and restart the nvargus camera daemon, and so on. No combination or ordering of these steps that I’ve tried has worked so far. I keep getting segmentation faults with errors like:

Error: NvArgusCameraSrc: DISCONNECTED (8): Argus Error Status

Switching to splash screen pipeline
Switched pipelines.
Setting camera pipeline to null
(Argus) Error EndOfFile: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 266)
(Argus) Error EndOfFile: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 368)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState:  (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
Segmentation fault

The disconnect error at the top is acceptable, since the bus sees it and the application can implement logic to try to keep working. The Argus errors about the socket being in a bad state are evidently what is causing the crash, and I’m not sure how to resolve them properly.
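The TODO in the bus handler ("Add camera error check specifically") could be filled in along these lines. This is a minimal sketch, not a confirmed fix: `is_camera_disconnect` is a hypothetical helper, and the strings it matches are taken only from the single log line quoted above, so other driver or JetPack versions may word the error differently.

```python
def is_camera_disconnect(source_name, error_text):
    """Return True if a bus error looks like an Argus camera disconnect.

    Assumption: the match strings come from the one log line quoted in this
    post ("NvArgusCameraSrc: DISCONNECTED (8): Argus Error Status").
    """
    name = (source_name or "").lower()
    text = (error_text or "").lower()
    # The error may identify the camera either via the posting element's
    # name or via the error text itself, so accept both.
    from_argus = "nvarguscamerasrc" in name or "nvarguscamerasrc" in text
    return from_argus and "disconnected" in text


# Possible use inside _bus_call, after message.parse_error():
#     src = message.src.get_name() if message.src else ""
#     if self.CamRunning and is_camera_disconnect(src, str(err)):
#         self._switch_pipeline()
```

With a check like this, unrelated errors raised while the camera pipeline is active would still fall through to `self.stop()` rather than triggering a switch.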

Can what I’m trying to do be done? I would think that with the correct manipulation of pipeline elements I should be able to get it to work, and I think the general idea of setting the pipeline to NULL and reinitializing it should also work, but I have not had success.
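Independently of whatever is going wrong inside nvarguscamerasrc, the switching logic itself can be tightened. The sketch below is an assumption-laden illustration, not a verified cure for the segfault: `PipelineSwitcher` is a hypothetical class that holds one shared lock, keeps exactly one bus watch alive at a time, and sets the outgoing pipeline to NULL (the original code uses PAUSED) before starting the next one. It is duck-typed, so `null_state` and `playing_state` stand in for `Gst.State.NULL` and `Gst.State.PLAYING`, and the pipeline objects only need `get_bus()` and `set_state()`.

```python
import threading


class PipelineSwitcher:
    """Tracks the active pipeline and keeps a single bus watch attached."""

    def __init__(self, on_message, null_state, playing_state):
        self._lock = threading.Lock()  # one shared lock, not one per call
        self._on_message = on_message
        self._null = null_state
        self._playing = playing_state
        self._bus = None
        self._handler_id = None
        self.active = None

    def _unwatch(self):
        # Undo add_signal_watch()/connect() on the bus we are leaving.
        if self._bus is not None:
            self._bus.disconnect(self._handler_id)
            self._bus.remove_signal_watch()
            self._bus = None
            self._handler_id = None

    def activate(self, pipeline):
        """Stop the current pipeline, play `pipeline`, return the old one."""
        with self._lock:
            old = self.active
            self._unwatch()
            if old is not None:
                old.set_state(self._null)
            bus = pipeline.get_bus()
            bus.add_signal_watch()
            self._handler_id = bus.connect("message", self._on_message)
            self._bus = bus
            pipeline.set_state(self._playing)
            self.active = pipeline
            return old
```

In the application this might be used as `old = switcher.activate(splash_pipeline)`, after which `old` can be dropped and a fresh camera pipeline built with `_create_camera_pipeline()` when the camera returns. Since setting the camera pipeline to NULL is reported later in this thread as the trigger for the crash, this only removes application-side causes (a lock that does not lock, accumulating handlers) from the picture.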

Thank you for the help, or even just for reading this far.

Perhaps a newer version of the nvargus module has fixed this seg fault? Does anyone with knowledge of this codebase have any thoughts on that matter?

Are you running two cameras?
If so, stopping nvargus-daemon would disrupt the camera that is still running.

Just one camera, which can be disconnected and reconnected (or replaced with a different camera). Stopping and starting the daemon doesn’t seem to have much effect. Simply setting the pipeline to NULL at any point causes the segfault, so something isn’t being cleaned up properly in the nvarguscamerasrc element. It’s possible this is fixed in a more recent DeepStream version, though I’m not in a great position to test that at the moment.

A colleague of mine also mentioned it’s possible that our camera driver isn’t being deleted properly by the element, but I wasn’t sure what to think of that, or whether it was relevant. Perhaps someone with more intimate knowledge of the nvarguscamera source code might have thoughts on that.

You can download the source code from the download center to debug it.

Thanks

Do you mean the source code for the nvarguscamera stuff, or a higher Jetpack/Deepstream version?

The source of nvarguscamerasrc.
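Before stepping through the nvarguscamerasrc source, it may help to see which Python call is in flight when the process dies. The following is a minimal sketch using only the standard library: `enable_crash_diagnostics` is a hypothetical helper name, and the default level "2" (warnings) for GStreamer's `GST_DEBUG` variable is an arbitrary choice.

```python
import faulthandler
import os


def enable_crash_diagnostics(gst_debug_level="2", stream=None):
    """Turn on crash reporting; call this before Gst.init().

    faulthandler prints the Python traceback of every thread when the
    process receives SIGSEGV. GST_DEBUG is GStreamer's log-level
    environment variable and is only set here if not already present.
    """
    if stream is None:
        faulthandler.enable(all_threads=True)  # writes to sys.stderr
    else:
        # The file must stay open for as long as faulthandler is enabled.
        faulthandler.enable(file=stream, all_threads=True)
    os.environ.setdefault("GST_DEBUG", gst_debug_level)
    return faulthandler.is_enabled()
```

Calling it at the top of `main()` would show whether the crash happens inside the `set_state(Gst.State.NULL)` call or later. It reports only the Python side, so the native stack inside the plugin would still need a debugger.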