nvarguscamerasrc crashes after a few hours of running

Hello,
My AGX Orin runs a few GStreamer pipelines that capture frames from 8 cameras. The pipeline itself is quite complex:

nvarguscamerasrc name=nvarguscamerasrc_left sensor-id=$SENSOR_ID !
      video/x-raw(memory:NVMM),width=1920,height=1200,framerate=30/1,format=NV12 !
      comp.sink_0
      nvarguscamerasrc name=nvarguscamerasrc_right sensor-id=$SENSOR_ID2 !
      video/x-raw(memory:NVMM),width=1920,height=1200,framerate=30/1,format=NV12 !
      comp.sink_1 nvcompositor name=comp background=0
      sink_0::xpos=0 sink_0::ypos=0 sink_0::width=1920 sink_0::height=1200 sink_0::sizing-policy=1
      sink_1::xpos=1920 sink_1::ypos=0 sink_1::width=1920 sink_1::height=1200 sink_1::sizing-policy=1 !
      video/x-raw(memory:NVMM),width=3840,height=1200,framerate=30/1 ! tee name=timg0 !
      queue !
      vpiremaptransform pass=false mapfile=$REMAP_FILE backend=2 !
      video/x-raw(memory:NVMM),width=3840,height=1920,format=NV12,framerate=30/1 !
      nvv4l2h264enc name=stereoh264 preset-level=UltraFastPreset maxperf-enable=true poc-type=2 iframeinterval=5 insert-sps-pps=true control-rate=1 bitrate=15000000 !
      h264parse !
      video/x-h264,stream-format=byte-stream !
      somekindofsink
      timg0. !
      queue !
      nvvidconv bl-output=false ! video/x-raw(memory:NVMM),width=1920,height=600,format=NV12 !
      videorate name=videorate_all drop-only=true ! video/x-raw(memory:NVMM),framerate=15/1 ! tee name=uncomp0 !
      nvjpegenc name=jpeg0 !
      tee name=tcp0 ! 
      queue !
      somekindofsink2
      uncomp0. !
      nvvidconv !
      video/x-raw(memory:NVMM),format=RGBA !
      queue !
      somekindofsink3

The problem is that after running for some time (anywhere from 2 to 9 hours), it crashes with this nvargus-daemon log:

Logs
SCF: Error InvalidState:  Corr Error 8 Received for sensor 23 .. Continuing!
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 654)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/FusaCaptureViCsiHw.cpp, function startCaptureInternal(), line 877)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function doCSItoMemCapture(), line 547)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function issueCapture(), line 490)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1565)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1394)
SCF: Error ResourceAlreadyInUse:  (propagating from src/common/Utils.cpp, function workerThread(), line 125)
SCF: Error ResourceAlreadyInUse: Worker thread CaptureScheduler frameStart failed (in src/common/Utils.cpp, function workerThread(), line 144)
SCF: Error Timeout:  (propagating from src/api/Buffer.cpp, function waitForUnlock(), line 644)
SCF: Error Timeout:  (propagating from src/components/CaptureContainerImpl.cpp, function returnBuffer(), line 430)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 22
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 0
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 1
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 15
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 16
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Session has suffered a critical failure (in src/api/Session.cpp, function capture(), line 734)
(Argus) Error InvalidState:  (propagating from src/api/ScfCaptureThread.cpp, function run(), line 110)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 14
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 13
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)

The AGX Orin runs a Yocto Project image based on the meta-tegra layer.
I haven't tried to reproduce this on JetPack yet, but will try ASAP. Note that htop shows no significant memory leaks in the system.
Could you please help me figure out what is going on here and suggest any possible ways to fix it?
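
A slow leak is easier to spot in a logged time series than in an htop snapshot. A minimal sketch, assuming a Linux /proc filesystem; the function name mem_sample, the 60-second interval, and the output path are illustrative choices and not part of the setup above. It records MemAvailable and SUnreclaim (unreclaimable kernel slab) from /proc/meminfo together with the VmRSS of one process:

```shell
#!/bin/sh
# Print one CSV sample: epoch seconds, MemAvailable (kB), SUnreclaim (kB),
# and VmRSS (kB) of the given PID. Any field that cannot be read is printed as 0.
mem_sample() {
    pid="$1"
    avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null)
    slab=$(awk '/^SUnreclaim:/ {print $2}' /proc/meminfo 2>/dev/null)
    rss=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
    echo "$(date +%s),${avail:-0},${slab:-0},${rss:-0}"
}

# Usage (hypothetical interval and path; assumes a single nvargus-daemon process):
# while sleep 60; do mem_sample "$(pidof nvargus-daemon)"; done >> /tmp/mem.csv
```

Plotting the second and third columns over a multi-hour run should show whether available memory shrinks or kernel slab grows even when no single process stands out.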

What version are you running?

cat /etc/nv_tegra_release
nvidia@tegra-ubuntu /mnt/nvme/ws_gst/src/ros2-gst-bridge (develop) $ cat /etc/nv_tegra_release
# R36 (release), REVISION: 4.3, GCID: 38968081, BOARD: generic, EABI: aarch64, DATE: Wed Jan  8 01:49:37 UTC 2025
# KERNEL_VARIANT: oot
TARGET_USERSPACE_LIB_DIR=nvidia
TARGET_USERSPACE_LIB_DIR_PATH=usr/lib/aarch64-linux-gnu/nvidia

By the way, I ran the same setup on JetPack 6.2 and could not reproduce the crash, so it is probably a Yocto issue.

Please apply the patches below to verify.

[Argus] Long run stability issue.
https://forums.developer.nvidia.com/t/318999/5

[gpu/host1x-fence] Significant kernel memory leak when using Argus camera
https://forums.developer.nvidia.com/t/325399/4/

Thanks for your reply.
I updated my argus*.so files and patched my kernel, and it does help: there are no memory leaks anymore. Here is a plot from three days of running my pipeline:


However, after three days (78:50:49, to be precise), I got this nvargus error:

Logs
SCF: Error InvalidState:  Corr Error 8 Received for sensor 14 .. Continuing!
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 654)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/FusaCaptureViCsiHw.cpp, function startCaptureInternal(), line 877)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function doCSItoMemCapture(), line 547)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function issueCapture(), line 490)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1565)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1394)
SCF: Error ResourceAlreadyInUse:  (propagating from src/common/Utils.cpp, function workerThread(), line 125)
SCF: Error ResourceAlreadyInUse: Worker thread CaptureScheduler frameStart failed (in src/common/Utils.cpp, function workerThread(), line 144)
SCF: Error Timeout:  (propagating from src/api/Buffer.cpp, function waitForUnlock(), line 644)
SCF: Error Timeout:  (propagating from src/components/CaptureContainerImpl.cpp, function returnBuffer(), line 430)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 23
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 22
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Sending critical error event for Session 1
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 0
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 15
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 13
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 16
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
=== gst_pipeline_video_node[289]: Connection closed (FFFF99D1B8E0)
=== gst_pipeline_video_node[289]: WARNING: CameraProvider was not destroyed before client connection terminated.
=== gst_pipeline_video_node[289]:          The client may have abnormally terminated. Destroying CameraProvider...
=== gst_pipeline_video_node[289]: CameraProvider destroyed (0xffff95760b80)
=== gst_pipeline_video_node[289]: WARNING: Cleaning up 2 outstanding requests...
=== gst_pipeline_video_node[289]: WARNING: Cleaning up 2 outstanding stream settings...
=== gst_pipeline_video_node[289]: WARNING: Cleaning up 2 outstanding queues...
=== gst_pipeline_video_node[289]: WARNING: Cleaning up 2 outstanding sessions...

This looks the same as what I got before, but the runtime is much better, at least in the one attempt I have made.
Do you have any other ideas for making it run stably, with no crashes at all?

Just to confirm, I ran this pipeline one more time and got a crash after 70 hours with the same Argus report:

Logs
SCF: Error InvalidState:  Corr Error 8 Received for sensor 1 .. Continuing!
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 654)
Wrong frequency range!
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/FusaCaptureViCsiHw.cpp, function startCaptureInternal(), line 877)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function doCSItoMemCapture(), line 547)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function issueCapture(), line 490)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1565)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1394)
SCF: Error ResourceAlreadyInUse:  (propagating from src/common/Utils.cpp, function workerThread(), line 125)
SCF: Error ResourceAlreadyInUse: Worker thread CaptureScheduler frameStart failed (in src/common/Utils.cpp, function workerThread(), line 144)
SCF: Error Timeout:  (propagating from src/api/Buffer.cpp, function waitForUnlock(), line 644)
SCF: Error Timeout:  (propagating from src/components/CaptureContainerImpl.cpp, function returnBuffer(), line 430)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 23 
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 22 
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 0 
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 15 
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 13 
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 16 
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 14 
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
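
Since these crash logs are long and mostly repetitive, it can help to reduce each one to the line that starts the cascade and the list of affected sessions. A rough sketch using sed, assuming the daemon output has been saved to a text file; the function names corr_errors and failed_sessions and the file name nvargus.log are made up for illustration:

```shell
#!/bin/sh
# Read a daemon log on stdin and print "corr_error=<code> sensor=<id>"
# for every "Corr Error ... Received for sensor ..." line, in log order.
corr_errors() {
    sed -nE 's/.*Corr Error ([0-9]+) Received for sensor ([0-9]+).*/corr_error=\1 sensor=\2/p'
}

# Read a daemon log on stdin and print the distinct session ids that
# received a critical error event, sorted numerically.
failed_sessions() {
    sed -nE 's/.*Sending critical error event for Session ([0-9]+).*/\1/p' | sort -nu
}

# Usage (hypothetical file name):
# corr_errors < nvargus.log
# failed_sessions < nvargus.log
```

Applied to the three logs above, corr_errors reports Corr Error 8 for sensor 23, then 14, then 1, so the sensor that triggers the cascade is different in each run.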

Could you verify with infinite timeout mode enabled?

sudo service nvargus-daemon stop
sudo enableCamInfiniteTimeout=1 nvargus-daemon

Hello, enableCamInfiniteTimeout=1 was set in all experiments.
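
One way to double-check that the variable actually reached the running daemon (for example when it is started by a service manager rather than by hand) is to read the process environment from /proc. A small sketch, assuming Linux; the helper name proc_env_value is hypothetical, and /proc/PID/environ only reflects the environment the process was started with:

```shell
#!/bin/sh
# Print the value that variable $2 had when process $1 was started.
# Returns non-zero if the variable is absent or empty, or if the
# environ file is not readable by the current user.
proc_env_value() {
    [ -r "/proc/$1/environ" ] || return 1
    tr '\0' '\n' < "/proc/$1/environ" | sed -n "s/^$2=//p" | grep .
}

# Usage (typically needs root if the daemon runs as root;
# assumes a single nvargus-daemon process):
# proc_env_value "$(pidof nvargus-daemon)" enableCamInfiniteTimeout
```

If the call prints 1, the daemon was launched with the setting; no output and a non-zero exit status mean it was not found in that process.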

Please replace the library with the attached one and test again.

libnvscf.so.36.4.inifinte_2 (8.5 MB)

Hello,

I applied the patched library and started nvargus-daemon as usual:

enableCamInfiniteTimeout=1 nvargus-daemon

However, I observe the following issues:

  • The stream crashes after about 15 hours of continuous streaming.
  • nvarguscamerasrc reports: CONSUMER: ERROR OCCURRED
  • nvargus-daemon does not report any errors, but after applying the patch I noticed a new type of log message, for example:
    assignAllBuffersFromStream if ii 0 stream 0xffff40001310 and buffer (nil) 281472879399296meout 4af0000077f

I also tried to reproduce the same experiment on a clean JetPack 6.2 installation (without any patches) and got much better results: the stream ran for 189 hours before failing with this log message:

Logs
=== gst_pipeline_video_node[338]: Connection closed (FFFEA27CB840)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 22, capture sequence ID = 20475841 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 0, capture sequence ID = 20476205 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 23, capture sequence ID = 20475839 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Sensor 16 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
(Argus) Error FileOperationFailed: Socket write failed: Broken pipe (in src/rpc/socket/common/SocketUtils.cpp, function writeSocket(), line 49)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function sendResponse(), line 92)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function handleRequest(), line 174)
(Argus) Error FileOperationFailed: Message processing failed (id = 532370485) (in libs/rpc_socket_server/ServerWorkerThread.cpp, function run(), line 143)
(Argus) Error FileOperationFailed: Socket write failed: Broken pipe (in src/rpc/socket/common/SocketUtils.cpp, function writeSocket(), line 49)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function sendResponse(), line 92)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function handleRequest(), line 174)
(Argus) Error FileOperationFailed: Message processing failed (id = 532352072) (in libs/rpc_socket_server/ServerWorkerThread.cpp, function run(), line 143)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 14, capture sequence ID = 20475294 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error Timeout: Sending critical error event for Session 0
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error Timeout: Sending critical error event for Session 14
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 1, capture sequence ID = 20476203 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Sensor GUID 14 is in error state. Skipping requests, capture sequence ID = 20475295 continue draining session frameStart events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error Timeout: Sending critical error event for Session 23
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Sensor GUID 23 is in error state. Skipping requests, capture sequence ID = 20475840 continue draining session frameStart events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error InvalidState: Sensor 22 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 22, capture sequence ID = 20475840 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor 1 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 1, capture sequence ID = 20476202 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor 13 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 13, capture sequence ID = 10237316 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 15, capture sequence ID = 10237320 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 16, capture sequence ID = 20475295 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor 23 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
(Argus) Error FileOperationFailed: Socket write failed: Broken pipe (in src/rpc/socket/common/SocketUtils.cpp, function writeSocket(), line 49)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function sendResponse(), line 92)
PowerServiceCore:handleRequests: timePassed = 4363372
SCF: Error InvalidState: Sensor 14 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Sensor GUID 0 is in error state. Skipping requests, capture sequence ID = 20476206 continue draining session frameStart events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error InvalidState: Sensor 0 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 0, capture sequence ID = 20476204 draining session frameEnd events 4
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 13, capture sequence ID = 10237317 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
(Argus) Error FileOperationFailed: Socket write failed: Broken pipe (in src/rpc/socket/common/SocketUtils.cpp, function writeSocket(), line 49)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function sendResponse(), line 92)
SCF: Error InvalidState: Sensor 15 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error Timeout: Sending critical error event for Session 22
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function handleRequest(), line 174)
SCF: Error InvalidState: Sensor GUID 22 is in error state. Skipping requests, capture sequence ID = 20475842 continue draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 14, capture sequence ID = 20475293 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function handleRequest(), line 174)
(Argus) Error FileOperationFailed: Message processing failed (id = 612210012) (in libs/rpc_socket_server/ServerWorkerThread.cpp, function run(), line 143)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 23, capture sequence ID = 20475838 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
(Argus) Error FileOperationFailed: Message processing failed (id = 532370486) (in libs/rpc_socket_server/ServerWorkerThread.cpp, function run(), line 143)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 16, capture sequence ID = 20475296 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 15, capture sequence ID = 10237319 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error Timeout: Sending critical error event for Session 1
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error Timeout: Sending critical error event for Session 13
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error InvalidState: Sensor 13 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 13, capture sequence ID = 10237317 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor 0 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error Timeout: Sending critical error event for Session 16
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Sensor GUID 0 is in error state. Skipping requests, capture sequence ID = 20476207 continue draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
(Argus) Error FileOperationFailed: Socket write failed: Broken pipe (in src/rpc/socket/common/SocketUtils.cpp, function writeSocket(), line 49)
SCF: Error InvalidState: Sensor GUID 0 is in error state. Skipping requests, capture sequence ID = 20476208 continue draining session frameStart events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error Timeout: Sending critical error event for Session 15
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
(Argus) Error FileOperationFailed: Socket write failed: Broken pipe (in src/rpc/socket/common/SocketUtils.cpp, function writeSocket(), line 49)
SCF: Error InvalidState: Sensor GUID 23 is in error state. Skipping requests, capture sequence ID = 20475841 continue draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error InvalidState: Sensor 16 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 16, capture sequence ID = 20475296 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor 13 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Sensor 1 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 1, capture sequence ID = 20476203 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function sendResponse(), line 92)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function handleRequest(), line 174)
SCF: Error InvalidState: Sensor GUID 1 is in error state. Skipping requests, capture sequence ID = 20476204 continue draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error InvalidState: Sensor 23 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Sensor GUID 16 is in error state. Skipping requests, capture sequence ID = 20475297 continue draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error Timeout: Sending critical error event for Session 15
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
(Argus) Error FileOperationFailed: Socket write failed: Broken pipe (in src/rpc/socket/common/SocketUtils.cpp, function writeSocket(), line 49)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function sendResponse(), line 92)
SCF: Error InvalidState: Sensor GUID 15 is in error state. Skipping requests, capture sequence ID = 10237321 continue draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error InvalidState: Sensor GUID 13 is in error state. Skipping requests, capture sequence ID = 10237318 continue draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
=== gst_pipeline_video_node[338]: WARNING: CameraProvider was not destroyed before client connection terminated.
=== gst_pipeline_video_node[338]:          The client may have abnormally terminated. Destroying CameraProvider...
=== gst_pipeline_video_node[338]: CameraProvider destroyed (0xfffe70000c70)
=== gst_pipeline_video_node[338]: WARNING: Cleaning up 2 outstanding requests...
(Argus) Error FileOperationFailed: Message processing failed (id = 612210013) (in libs/rpc_socket_server/ServerWorkerThread.cpp, function run(), line 143)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 23, capture sequence ID = 20475839 draining session frameEnd events 4
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor GUID 22 is in error state. Skipping requests, capture sequence ID = 20475843 continue draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error InvalidState: Sensor 15 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 13, capture sequence ID = 10237318 draining session frameEnd events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor 22 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 22, capture sequence ID = 20475841 draining session frameEnd events 4
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 0, capture sequence ID = 20476205 draining session frameEnd events 4
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
=== gst_pipeline_video_node[338]: WARNING: Cleaning up 2 outstanding stream settings...
=== gst_pipeline_video_node[338]: WARNING: Cleaning up 2 outstanding queues...
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 15, capture sequence ID = 10237320 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
=== gst_pipeline_video_node[338]: WARNING: Cleaning up 2 outstanding sessions...
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function sendResponse(), line 92)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function handleRequest(), line 174)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function handleRequest(), line 174)
(Argus) Error FileOperationFailed: Message processing failed (id = 266153466) (in libs/rpc_socket_server/ServerWorkerThread.cpp, function run(), line 143)
(Argus) Error FileOperationFailed: Socket write failed: Broken pipe (in src/rpc/socket/common/SocketUtils.cpp, function writeSocket(), line 49)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function sendResponse(), line 92)
(Argus) Error FileOperationFailed:  (propagating from libs/rpc_socket_server/ServerSocketManager.cpp, function handleRequest(), line 174)
(Argus) Error FileOperationFailed: Message processing failed (id = 532352073) (in libs/rpc_socket_server/ServerWorkerThread.cpp, function run(), line 143)
(Argus) Error FileOperationFailed: Message processing failed (id = 266153465) (in libs/rpc_socket_server/ServerWorkerThread.cpp, function run(), line 143)
SCF: Error InvalidState: Ignoring error as sensor is in error and ViCsi should be closing!
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function startCaptureInternal(), line 874)
SCF: Error InvalidState: Sensor GUID 0 is in error state. Skipping requests, capture sequence ID = 20476209 continue draining session frameStart events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error InvalidState: Ignoring error as sensor is in error and ViCsi should be closing!
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function startCaptureInternal(), line 874)
SCF: Error InvalidState: Ignoring error as sensor is in error and ViCsi should be closing!
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function startCaptureInternal(), line 874)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error BadParameter: CC has already been disposed (in src/components/CaptureContainerManager.cpp, function dispose(), line 161)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 1, capture sequence ID = 20476205 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 16, capture sequence ID = 20475298 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Sensor 23 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Sensor 16 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 13, capture sequence ID = 10237319 draining session frameStart events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 14, capture sequence ID = 20475294 draining session frameEnd events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 0, capture sequence ID = 20476206 draining session frameEnd events 5
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor 15 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 15, capture sequence ID = 10237321 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor 22 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 22, capture sequence ID = 20475842 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor GUID 16 is in error state. Skipping requests, capture sequence ID = 20475299 continue draining session frameStart events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 23, capture sequence ID = 20475842 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 16, capture sequence ID = 20475297 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Sensor 13 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Sensor 1 already in same state
 (in src/services/capture/CaptureServiceDeviceSensor.cpp, function setErrorState(), line 100)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 1, capture sequence ID = 20476204 draining session frameEnd events 3
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 22, capture sequence ID = 20475844 draining session frameStart events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 15, capture sequence ID = 10237322 draining session frameStart events 2
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 529)
SCF: Error InvalidState: Sensor GUID 1 is in error state. Skipping requests, capture sequence ID = 20476206 continue draining session frameStart events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 543)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 23, capture sequence ID = 20475840 draining session frameEnd events 4
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 13, capture sequence ID = 10237319 draining session frameEnd events 1
 (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 646)
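
A log this long is easier to read once it is tallied. The following is only a rough sketch, assuming the daemon output was saved to a text file; it counts errors by kind and timeouts by sensor GUID. The function name `summarize` and the two regular expressions are my own, written against the message shapes visible in the log above, not anything provided by Argus.

```python
import re
from collections import Counter

# Patterns derived from the message shapes seen in the log above.
# They are assumptions about the format, not an official Argus log grammar.
KIND_RE = re.compile(r"(?:SCF:|\(Argus\)) Error (\w+):")
GUID_RE = re.compile(r"[Ss]ensor GUID (\d+)")

def summarize(lines):
    """Return (errors per kind, CSI timeouts per sensor GUID) for log lines."""
    kinds = Counter()
    timeouts = Counter()
    for line in lines:
        kind = KIND_RE.search(line)
        if kind:
            kinds[kind.group(1)] += 1
        # "Timeout!!" marks the frameStart/frameEnd wait timeouts.
        if "Timeout!!" in line:
            guid = GUID_RE.search(line)
            if guid:
                timeouts[int(guid.group(1))] += 1
    return kinds, timeouts

# A few lines copied from the log above, used as a small self-check.
SAMPLE = """\
SCF: Error InvalidState: Sensor 13 already in same state
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 13, capture sequence ID = 10237316 draining session frameEnd events 3
SCF: Error InvalidState: Timeout!! Skipping requests on sensor GUID 15, capture sequence ID = 10237320 draining session frameStart events 2
(Argus) Error FileOperationFailed: Socket write failed: Broken pipe (in src/rpc/socket/common/SocketUtils.cpp, function writeSocket(), line 49)
SCF: Error Timeout: Sending critical error event for Session 22
"""
```

On a saved log, `summarize(open(path))` gives the same two counters. If one GUID clearly dominates the timeouts, that points at a single camera or link rather than the whole capture stack.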

However, there is a reasonable chance the crash occurred because my DES board was suddenly powered off. I don’t have solid evidence, but 189 hours of uptime is still acceptable for my use case.

Unfortunately, the Yocto system still fails as described above, so I would appreciate any help or suggestions you can provide.

Hello,
Any updates from your side, guys?

That’s weird; you shouldn’t get a timeout while the infinite timeout is enabled.

In case I got something wrong, I have just verified my md5 hashes, and here they are:

root@jetson-agx-orin-devkit:/usr/lib# md5sum libnvargus.so
d6d03cea9e0d6aeb8d6cf95bbd939914  libnvargus.so
root@jetson-agx-orin-devkit:/usr/lib# md5sum libnvargus_socketserver.so
7b13df069b247a9dbe9f98aa1a815b7c  libnvargus_socketserver.so
root@jetson-agx-orin-devkit:/usr/lib# md5sum libnvscf.so
f1378bec9618376ac0127433bfc78492  libnvscf.so
root@jetson-agx-orin-devkit:/usr/lib# enableCamInfiniteTimeout=1 nvargus-daemon
=== NVIDIA Libargus Camera Service (0.99.33)
=== Listening for connections...

They look exactly like the hashes of the libs you gave me.
I would appreciate any support.
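
For anyone repeating this check on several boards, here is a minimal sketch that does the same comparison in one step. The expected values are the three hashes from my output above; the helper names `md5_of` and `check_libs` are just illustrative.

```python
import hashlib
from pathlib import Path

# Hashes copied from the md5sum output in this post.
EXPECTED = {
    "libnvargus.so": "d6d03cea9e0d6aeb8d6cf95bbd939914",
    "libnvargus_socketserver.so": "7b13df069b247a9dbe9f98aa1a815b7c",
    "libnvscf.so": "f1378bec9618376ac0127433bfc78492",
}

def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large libraries are not read in one go."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def check_libs(lib_dir, expected=EXPECTED):
    """Return {file name: 'ok' | 'mismatch' | 'missing'} for each expected lib."""
    result = {}
    for name, want in expected.items():
        path = Path(lib_dir) / name
        if not path.is_file():
            result[name] = "missing"
        else:
            result[name] = "ok" if md5_of(path) == want else "mismatch"
    return result
```

On my image the libs live in `/usr/lib`, so the call would be `check_libs("/usr/lib")`; on a stock Jetson Linux install the directory is probably different.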

Please kill all running nvargus-daemon processes with the kill command, then restart the daemon with the commands below to make sure the infinite timeout is actually enabled.

ps aux | grep -i nvargus-daemon
sudo kill <PID of nvargus-daemon>
sudo enableCamInfiniteTimeout=1 nvargus-daemon
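
One way to double-check that the variable really reached the running daemon is to read its environment from `/proc`. This is a sketch under assumptions: it is Linux-only, needs root to read another user's process, and `/proc/<pid>/environ` only shows the environment the process was started with. The helper names are hypothetical.

```python
from pathlib import Path

def parse_environ(raw):
    """Decode the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env

def daemons_with_flag(name="nvargus-daemon", flag="enableCamInfiniteTimeout"):
    """Map each PID whose command name equals `name` to the flag value (None if unset)."""
    found = {}
    proc_root = Path("/proc")
    if not proc_root.is_dir():
        return found  # not a Linux system
    for proc in proc_root.iterdir():
        if not proc.name.isdigit():
            continue
        try:
            if (proc / "comm").read_text().strip() != name:
                continue
            env = parse_environ((proc / "environ").read_bytes())
            found[int(proc.name)] = env.get(flag)
        except OSError:
            continue  # process exited meanwhile, or permission denied
    return found
```

Calling `daemons_with_flag()` as root should list exactly one PID with the value `"1"`. More than one PID, or a value of `None`, would mean another instance (for example one restarted by the init system) is running without the setting.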

Yep, that is what I actually do every time.

Here is one more crash report, which I got yesterday after only 40 minutes of uptime. All NVIDIA libs are patched.

Logs
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/FusaCaptureViCsiHw.cpp, function startCaptureInternal(), line 877)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function doCSItoMemCapture(), line 547)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function issueCapture(), line 490)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1565)
SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1394)
SCF: Error ResourceAlreadyInUse:  (propagating from src/common/Utils.cpp, function workerThread(), line 125)
SCF: Error ResourceAlreadyInUse: Worker thread CaptureScheduler frameStart failed (in src/common/Utils.cpp, function workerThread(), line 144)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error Timeout:  (propagating from src/api/Buffer.cpp, function waitForUnlock(), line 644)
SCF: Error Timeout:  (propagating from src/components/CaptureContainerImpl.cpp, function returnBuffer(), line 430)
SCF: Error InvalidState: Sending critical error event for Session 16
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 22
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
assignAllBuffersFromStream if ii 0 stream 0xfffe68001310 and buffer (nil) 281472740790656meout 4af0000077f
assignAllBuffersFromStream if ii 0 stream 0xfffe6c001310 and buffer (nil) 281472706974080meout 4af0000077f
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState: Sending critical error event for Session 1
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
assignAllBuffersFromStream if ii 0 stream 0xfffe5c0020d0 and buffer (nil) 281472602509696meout 4af0000077f
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 14
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 23
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 13
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Sending critical error event for Session 0
 (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
=== gst_pipeline_video_node[276]: Connection closed (FFFF7C3DB8E0)
=== gst_pipeline_video_node[276]: WARNING: CameraProvider was not destroyed before client connection terminated.
=== gst_pipeline_video_node[276]:          The client may have abnormally terminated. Destroying CameraProvider...
=== gst_pipeline_video_node[276]: CameraProvider destroyed (0xffff74d52ee0)
=== gst_pipeline_video_node[276]: WARNING: Cleaning up 2 outstanding requests...
=== gst_pipeline_video_node[276]: WARNING: Cleaning up 2 outstanding stream settings...
=== gst_pipeline_video_node[276]: WARNING: Cleaning up 2 outstanding queues...
=== gst_pipeline_video_node[276]: WARNING: Cleaning up 2 outstanding sessions...
user@media:/opt/robot/calibration$

Is the system under heavy load?

Could you try running the camera on a dedicated CPU with the taskset command?
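
A minimal sketch of both checks, in case it helps: one helper reports the load relative to the core count, the other builds a `taskset -c` command line. The CPU numbers and the command being pinned are placeholders, and the function names are only illustrative.

```python
import os

def load_per_cpu():
    """1-minute load average divided by the number of CPUs (about 1.0 means saturated)."""
    return os.getloadavg()[0] / (os.cpu_count() or 1)

def pin_command(cpus, argv):
    """Prefix argv with `taskset -c <cpu-list>` so the process only runs on those CPUs."""
    cpu_list = ",".join(str(c) for c in sorted(set(cpus)))
    return ["taskset", "-c", cpu_list, *argv]
```

For example, `pin_command([4, 5], ["nvargus-daemon"])` gives the argument list for `taskset -c 4,5 nvargus-daemon`, which can be passed to `subprocess.run`. Which cores to reserve depends on what else is running on the board.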

Thanks

Hello,
I have finally worked it out. Here is the missing fix, which Jetson Linux 36.4.3 has and the Yocto build doesn’t:

Patch
// add semaphore to avoid multi-cam race condition

---
 .../platform/tegra/rtcpu/capture-ivc-priv.h   | 17 ++++++++--
 drivers/platform/tegra/rtcpu/capture-ivc.c    | 34 +++++++++++++++++--
 2 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/nvidia-oot/drivers/platform/tegra/rtcpu/capture-ivc-priv.h b/nvidia-oot/drivers/platform/tegra/rtcpu/capture-ivc-priv.h
index e503f0af..c6eac69a 100644
--- a/nvidia-oot/drivers/platform/tegra/rtcpu/capture-ivc-priv.h
+++ b/nvidia-oot/drivers/platform/tegra/rtcpu/capture-ivc-priv.h
@@ -1,6 +1,16 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+/* SPDX-License-Identifier: LicenseRef-NvidiaProprietary
+ *
+ * SPDX-FileCopyrightText: Copyright (c) 2017-2025 NVIDIA CORPORATION.
+ *                         All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
  */
 
 #ifndef __CAPTURE_IVC_PRIV_H__
@@ -26,6 +36,7 @@ struct tegra_capture_ivc_cb_ctx {
 	tegra_capture_ivc_cb_func cb_func;
 	/** Private context of a VI/ISP capture context */
 	const void *priv_context;
+	struct semaphore sem_ch;
 };
 
 /**
diff --git a/nvidia-oot/drivers/platform/tegra/rtcpu/capture-ivc.c b/nvidia-oot/drivers/platform/tegra/rtcpu/capture-ivc.c
index 14e1ba3e..3287ae1a 100644
--- a/nvidia-oot/drivers/platform/tegra/rtcpu/capture-ivc.c
+++ b/nvidia-oot/drivers/platform/tegra/rtcpu/capture-ivc.c
@@ -1,6 +1,20 @@
-// SPDX-License-Identifier: GPL-2.0
-// Copyright (c) 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-
+// SPDX-License-Identifier: LicenseRef-NvidiaProprietary
+/*
+ * SPDX-FileCopyrightText: Copyright (c) 2017-2025 NVIDIA CORPORATION.
+ *                         All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
 #include <nvidia/conftest.h>
 
 #include <linux/tegra-capture-ivc.h>
@@ -17,12 +31,17 @@
 #include <linux/kthread.h>
 #include <linux/sched.h>
 #include <linux/version.h>
+#include <linux/semaphore.h>
 #include <asm/barrier.h>
 
 #include <trace/events/tegra_capture.h>
 
 #include "capture-ivc-priv.h"
 
+/* Timeout for acquiring channel-id */
+#define TIMEOUT_ACQUIRE_CHANNEL_ID 120
+
+
 static int tegra_capture_ivc_tx_(struct tegra_capture_ivc *civc,
 				const void *req, size_t len)
 {
@@ -184,6 +203,11 @@ int tegra_capture_ivc_notify_chan_id(uint32_t chan_id, uint32_t trans_id)
 
 	civc = __scivc_control;
 
+	if (down_timeout(&civc->cb_ctx[chan_id].sem_ch,
+				TIMEOUT_ACQUIRE_CHANNEL_ID)) {
+		return -EBUSY;
+	}
+
 	mutex_lock(&civc->cb_ctx_lock);
 
 	if (WARN(civc->cb_ctx[trans_id].cb_func == NULL,
@@ -288,6 +312,7 @@ int tegra_capture_ivc_unregister_control_cb(uint32_t id)
 	civc->cb_ctx[id].priv_context = NULL;
 
 	mutex_unlock(&civc->cb_ctx_lock);
+	up(&civc->cb_ctx[id].sem_ch);
 
 	/*
 	 * If it's trans_id, client encountered an error before or during
@@ -454,6 +479,9 @@ static int tegra_capture_ivc_probe(struct tegra_ivc_channel *chan)
 	mutex_init(&civc->cb_ctx_lock);
 	mutex_init(&civc->ivc_wr_lock);
 
+	for (i = 0; i < TOTAL_CHANNELS; i++)
+		sema_init(&civc->cb_ctx[i].sem_ch, 1);
+
 	/* Initialize kworker */
 	kthread_init_work(&civc->work, tegra_capture_ivc_worker);
 
-- 
2.25.1

This patch and the one from here have finally fixed my long-run crashes. I hope it helps someone else.
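
For readers who do not want to work through the diff, the idea is: every channel slot gets a binary semaphore, `tegra_capture_ivc_notify_chan_id()` has to take it with a timeout before the slot is used, and `tegra_capture_ivc_unregister_control_cb()` gives it back. Below is a user-space analogy of that pattern only, not the driver code. The class name, the channel count and the timeout are placeholders (the patch uses 120 as the `down_timeout()` argument, which the kernel interprets in jiffies).

```python
import threading

TOTAL_CHANNELS = 4        # placeholder; the driver defines its own value
ACQUIRE_TIMEOUT_S = 0.1   # placeholder; the patch passes 120 jiffies to down_timeout()

class ChannelTable:
    """Hypothetical stand-in for the per-channel callback table in capture-ivc.c."""

    def __init__(self, n=TOTAL_CHANNELS):
        # One binary semaphore per slot, like sema_init(&...sem_ch, 1) in probe().
        self._sems = [threading.Semaphore(1) for _ in range(n)]
        self._owner = [None] * n
        self._lock = threading.Lock()  # plays the role of cb_ctx_lock

    def notify_chan_id(self, chan_id, owner):
        # Timed acquire: give up instead of reusing a slot that is still held.
        if not self._sems[chan_id].acquire(timeout=ACQUIRE_TIMEOUT_S):
            return False  # the patch returns -EBUSY at this point
        with self._lock:
            self._owner[chan_id] = owner
        return True

    def unregister(self, chan_id):
        with self._lock:
            self._owner[chan_id] = None
        # Like up(): the slot may now be claimed by the next camera.
        self._sems[chan_id].release()
```

With this, a second `notify_chan_id(0, ...)` fails until `unregister(0)` has run, which is the serialization between cameras that the patch title ("avoid multi-cam race condition") refers to.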

Nope, I was wrong.
It seems that even this patch hasn’t helped, so I still need help resolving this issue.

Replace /usr/lib/aarch64-linux-gnu/tegra/libnvfusacap.so with the attached file to verify.

libnvfusacap.so.r36.4.dis_notify (231.9 KB)