Nvargus-daemon hangs after restarting gstreamer pipeline (L4T 32.2.1)

I am running nvargus-daemon and gstreamer pipelines inside a container. I am testing how resilient the software is to restarts, and nvargus-daemon seems to hang once every few restarts.

To start nvargus-daemon:

    nvargus-daemon

To start a sample gstreamer pipeline:

    gst-launch-1.0 -v -e --gst-debug-level=4 nvarguscamerasrc sensor_id=1 name=cam_front maxperf=true aelock=false awblock=false wbmode=1 ! 'video/x-raw(memory:NVMM), width=(int)1936, height=(int)1100, format=(string)NV12, framerate=(fraction)30/1' ! fakesink silent=false -v &> gstreamer.log

I’m stopping the gstreamer pipeline by sending a SIGINT:

    kill -SIGINT PID

I’m stopping nvargus-daemon by issuing a SIGTERM followed by a SIGKILL.
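
The SIGTERM-then-SIGKILL sequence described above can be sketched as a small shell helper. This is only an illustration, not something taken from the L4T documentation: the function name `stop_with_escalation` and the 5-second default grace period are assumptions, and it polls with `kill -0` to see whether the process is still alive.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send SIGTERM, then escalate to SIGKILL if the
# process is still alive after a grace period.
# Returns 0 if SIGTERM was enough (or the process was already gone),
# 1 if SIGKILL had to be sent.
stop_with_escalation() {
    local pid=$1
    local grace_seconds=${2:-5}        # assumed default, tune as needed
    local ticks=$((grace_seconds * 10))
    local waited=0

    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    # Poll every 0.1 s until the process disappears or the grace period ends.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$ticks" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 0.1
        waited=$((waited + 1))
    done
    return 0
}
```

For example, `stop_with_escalation "$(pidof nvargus-daemon)" 10` would give the daemon ten seconds to exit before killing it. A return value of 1 means the daemon did not react to SIGTERM in time, which is worth logging when chasing a hang like this one.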

The gstreamer console output during a shutdown after which nvargus-daemon hangs is:

    === gst-launch-1.0[676]: CameraProvider initialized (0x7f78a949f0)
    SCF: Error BadValue: NvPHSSendThroughputHints (in src/common/CameraPowerHint.cpp, function sendCameraPowerHint(), line 56)
    LSC: LSC surface is not based on full res! 
    === gst-launch-1.0[676]: Connection closed (7F7D4DE1D0) 
    === gst-launch-1.0[676]: WARNING: CameraProvider was not destroyed before client connection terminated.
    === gst-launch-1.0[676]:          The client may have abnormally terminated. Destroying CameraProvider...
    === gst-launch-1.0[676]: CameraProvider destroyed (0x7f78a949f0)
    === gst-launch-1.0[676]: WARNING: Cleaning up 1 outstanding requests...
    === gst-launch-1.0[676]: WARNING: Cleaning up 1 outstanding streams...
    SCF: Error InvalidState: 2 buffers still pending during EGLStreamProducer destruction (propagating from src/services/gl/EGLStreamProducer.cpp, function freeBuffers(), line 305

In the normal working case, the gstreamer console output when the pipeline is opened and closed is:

    === gst-launch-1.0[610]: CameraProvider initialized (0x7f98a94c90)
    SCF: Error BadValue: NvPHSSendThroughputHints (in src/common/CameraPowerHint.cpp, function sendCameraPowerHint(), line 56)
    LSC: LSC surface is not based on full res!
    === gst-launch-1.0[610]: CameraProvider destroyed (0x7f98a94c90)
    === gst-launch-1.0[610]: Connection closed (7F9FA941D0)
    === gst-launch-1.0[610]: Connection cleaned up (7F9FA941D0)
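
The difference between the two logs above can be checked automatically after each restart. The following is a rough sketch, assuming the messages land in a log file such as `gstreamer.log`; the function name `shutdown_was_clean` is made up for illustration, and the match strings are copied from the two logs in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a shutdown as clean or not, based on the
# messages seen in the two logs quoted in this post.
# Returns 0 for a clean shutdown, 1 otherwise.
shutdown_was_clean() {
    local log_file=$1

    # This warning only appears in the failing case.
    if grep -q "CameraProvider was not destroyed" "$log_file"; then
        return 1
    fi

    # Both of these lines appear in the working case.
    grep -q "CameraProvider destroyed" "$log_file" &&
        grep -q "Connection cleaned up" "$log_file"
}
```

For example, `shutdown_was_clean gstreamer.log || echo "unclean shutdown"` could run after every stop in a restart test loop, so a failing iteration is flagged before the next start.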

Hi,
We suggest upgrading the system. If you cannot upgrade to r32.3.1 and have to stay on r32.2, we suggest at least upgrading to r32.2.3.

If you stay on r32.2.1, you may try these patches from
https://elinux.org/L4T_Jetson/r32.2.1_patch
[VI][Xavier] fix the known issue of multiple cameras stream-off process.
[VI][Xavier] VI stops capturing on multistream configuration

Hi,
Thanks for the reply. I’m seeing this on 32.2.3 as well.

Hi,
Is the issue present if you exit the pipeline with Ctrl+C, or if you set the num-buffers property on nvarguscamerasrc so that the pipeline ends by itself? These are the two more common ways to exit gst-launch-1.0.

Hi, yes, I issue a SIGINT to the pipelines, which should be the same as Ctrl+C.

I have noticed better results by shutting down the pipelines one at a time with about ten seconds delay/sleep in between, and finally issuing a SIGTERM to nvargus-daemon. The pipelines seem to shut down cleanly but this isn’t a great solution since it relies on hard coded delays.
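
One way to avoid the hard-coded delays is to wait for each pipeline process to actually exit before signalling the next one. The sketch below is only an illustration of that idea: `wait_for_exit` and `stop_pipelines_in_order` are made-up names, and it is an assumption (not confirmed anywhere in this thread) that process exit is a good enough signal that nvargus-daemon has finished its own cleanup for that client.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait until a PID is gone or a timeout elapses.
# Returns 0 if the process exited, 1 on timeout.
wait_for_exit() {
    local pid=$1
    local timeout_seconds=$2
    local ticks=$((timeout_seconds * 10))
    local waited=0

    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$ticks" ]; then
            return 1
        fi
        sleep 0.1
        waited=$((waited + 1))
    done
    return 0
}

# Hypothetical helper: send SIGINT to each pipeline PID in turn and move
# on to the next one only after the current one has exited.
# Usage: stop_pipelines_in_order TIMEOUT_SECONDS PID [PID...]
stop_pipelines_in_order() {
    local timeout_seconds=$1
    shift
    local pid

    for pid in "$@"; do
        # Skip PIDs that are already gone.
        kill -INT "$pid" 2>/dev/null || continue
        if ! wait_for_exit "$pid" "$timeout_seconds"; then
            echo "pipeline $pid did not exit within ${timeout_seconds}s" >&2
            return 1
        fi
    done
    return 0
}
```

For example, `stop_pipelines_in_order 15 "$front_pid" "$rear_pid"` (both variables are placeholders) would stop two pipelines one after the other, and nvargus-daemon would get its SIGTERM only after the call returns 0. If process exit turns out to be too early a signal, the same loop could wait for the "Connection cleaned up" line in the log instead.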

hello SanjayD,

because the Argus service is being shut down forcibly, it needs to run an internal dequeue process to empty its buffers.
may I know whether you need to restart the camera application immediately?

BTW, please also refer to the Argus samples for how to terminate a capture pipeline gracefully.
for example,
Argus/samples/userAutoExposure/main.cpp

    iSession->stopRepeat();
    iSession->waitForIdle();

    // Destroy the output streams (stops consumer threads).
    stream.reset();

    // Wait for the consumer threads to complete.
    PROPAGATE_ERROR(previewConsumerThread.shutdown());

    // Shut down Argus.
    cameraProvider.reset();

Hi, thanks for the reply. Yes, I need to use the camera application immediately after restarting the pipelines. As mentioned in my previous reply, I am now able to restart much more reliably by adding delays while shutting down the pipelines launched with gst-launch-1.0.

Thanks for the pointer to the cpp example code. It will help if I move away from using the gst-launch app.