Nvargus-daemon crashes when 4 camera 4k@60 capture pipeline stops on AGX Xavier JP4.5.1

Hello,

I am experiencing an odd issue with nvarguscamera. I am using an NVIDIA AGX Xavier devkit with JP4.5.1.

I am capturing from multiple sensors (IMX415) at 4K resolution. The capture goes great. I can capture from 4 cameras 4k@60fps with no issues at all. However, when I kill (ctrl+c) the pipeline, nvargus-daemon crashes with the following error:

NvCaptureStatusErrorDecode Stream 2.0 failed: sof_ts 6353057060736 eof_ts 6353066758976 frame 173 error 0 data 0x00000000
NvCaptureStatusErrorDecode Capture-Error: UNKNOWN (0x00000000)
(NvCapture) Error InvalidState: Channel is in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureRequestGetAttribute(), line 1870)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 913)
SCF: Error InvalidState: Capture error with status 0 (channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 922)
SCF: Error InvalidState: Sequence order error (124 received, 0 expected, channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 929)
(NvCapture) Error InvalidState: Channel in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureReleaseRequest(), line 649)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 940)
SCF: Error InvalidState: Sending critical error event (in src/api/Session.cpp, function sendErrorEvent(), line 992)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
NvCaptureStatusErrorDecode Stream 0.0 failed: sof_ts 6353063370112 eof_ts 6353066758976 frame 150 error 0 data 0x00000000
NvCaptureStatusErrorDecode Capture-Error: UNKNOWN (0x00000000)
(NvCapture) Error InvalidState: Channel is in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureRequestGetAttribute(), line 1870)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 913)
SCF: Error InvalidState: Capture error with status 0 (channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 922)
SCF: Error InvalidState: Sequence order error (124 received, 0 expected, channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 929)
(NvCapture) Error InvalidState: Channel in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureReleaseRequest(), line 649)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 940)
NvCaptureStatusErrorDecode Stream 4.0 failed: sof_ts 6353058082048 eof_ts 6353066759040 frame 138 error 0 data 0x00000000
NvCaptureStatusErrorDecode Capture-Error: UNKNOWN (0x00000000)
(NvCapture) Error InvalidState: Channel is in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureRequestGetAttribute(), line 1870)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 913)
SCF: Error InvalidState: Capture error with status 0 (channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 922)
SCF: Error InvalidState: Sequence order error (122 received, 0 expected, channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 929)
(NvCapture) Error InvalidState: Channel in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureReleaseRequest(), line 649)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 940)
SCF: Error InvalidState: Sending critical error event (in src/api/Session.cpp, function sendErrorEvent(), line 992)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
SCF: Error InvalidState: Timeout waiting on frame start sensor guid 4, capture sequence ID = 123 (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 1010)
SCF: Error InvalidState: Timeout waiting on frame start sensor guid 0, capture sequence ID = 125 (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 1010)
SCF: Error Timeout:  (propagating from src/services/capture/CaptureServiceDeviceViCsi.cpp, function waitCompletion(), line 339)
SCF: Error Timeout:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function pause(), line 938)
SCF: Error Timeout: During capture abort, syncpoint wait timeout waiting for current frame to finish (in src/services/capture/CaptureServiceDevice.cpp, function handleCancelSourceRequests(), line 1032)
SCF: Error InvalidState: Timeout waiting on frame start sensor guid 2, capture sequence ID = 125 (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 1010)
PowerServiceCore:handleRequests: timePassed = 1524
(NvCapture) Error InvalidState: Channel in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureGetRequest(), line 693)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function startCaptureInternal(), line 582)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureRecord.cpp, function doCSItoMemCapture(), line 532)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureRecord.cpp, function issueCapture(), line 469)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1293)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueBubbleFillCapturesIfNeeded(), line 676)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1135)
SCF: Error InvalidState:  (propagating from src/common/Utils.cpp, function workerThread(), line 116)
SCF: Error InvalidState: Worker thread CaptureScheduler frameStart failed (in src/common/Utils.cpp, function workerThread(), line 133)
SCF: Error InvalidState: Sending critical error event (in src/api/Session.cpp, function sendErrorEvent(), line 992)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
SCF: Error InvalidState: Something went wrong with waiting on frame start (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 1069)
SCF: Error InvalidState: Something went wrong with waiting on frame start (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 1069)
SCF: Error InvalidState: Something went wrong with waiting on frame start (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameStart(), line 1069)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 908)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 395)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 87)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error Timeout:  (propagating from src/api/Buffer.cpp, function waitForUnlock(), line 637)
SCF: Error Timeout:  (propagating from src/components/CaptureContainerImpl.cpp, function returnBuffer(), line 358)
waitForIdleLocked remaining request 220 
waitForIdleLocked remaining request 217 
SCF: Error Timeout: waitForIdle() timed out (in src/api/Session.cpp, function waitForIdleLocked(), line 922)
SCF: Error Timeout:  (propagating from src/api/Session.cpp, function abortCaptures(), line 888)

At that point nvargus-daemon keeps printing the same errors over and over, and I am forced to kill the daemon with ctrl+c to make it stop.

The issue is not deterministic: sometimes it happens, other times it doesn’t. However, the rate at which it occurs seems to be related to the resolution, frame rate, and number of cameras used. If I capture from 3 cameras at 4k@60, the issue doesn’t seem to appear. But when I capture from 4 cameras with the same resolution and frame rate, the issue shows up more often. If I capture from 3 cameras at 4k@90, the issue also shows up more frequently. And if I capture at 30fps, the issue seems to stop.
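
To put the configurations above side by side, here is a rough sketch that computes the aggregate pixel rate (width x height x fps x cameras) for each one. It assumes an active area of 3840x2160 for "4K" and, for the 30fps case, 4 cameras; neither value is stated explicitly above, and `pixel_rate` is just an illustrative helper name.

```shell
#!/bin/sh
# Aggregate pixel rate = width * height * cameras * fps.
# WIDTH and HEIGHT are assumptions (3840x2160); adjust to the real sensor mode.
WIDTH=3840
HEIGHT=2160

# pixel_rate is a hypothetical helper: $1 = number of cameras, $2 = fps.
pixel_rate() {
    echo $(( WIDTH * HEIGHT * $1 * $2 ))
}

# The 4-camera count for the 30 fps case is an assumption.
for cfg in "3 60" "4 60" "3 90" "4 30"; do
    set -- $cfg
    printf '%s cameras @ %s fps: %s pixels/s\n' "$1" "$2" "$(pixel_rate "$1" "$2")"
done
```

Under these assumptions, the two configurations where the issue shows up more often (4 cameras at 60fps and 3 cameras at 90fps) come out at roughly 2.0 and 2.2 Gpixel/s, while the two that behave (3 cameras at 60fps and 30fps capture) come out at roughly 1.5 and 1.0 Gpixel/s. That is consistent with a load-related problem, but it does not prove one.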

I believe the issue could be related to the time it takes for nvargus to free the resources before the camera is powered off. However, I am not sure. I would appreciate it very much if you could share some recommendations.

best regards,
Andres Campos
Embedded Software Engineer

Did you try boosting the nvcsi/vi/isp clocks to verify?
What happens if you run nvargus-daemon in infinite timeout mode?

Hello @ShaneCCC ,

Thanks for your help.

Even when running nvargus-daemon with both the clocks boosted and the infinite timeout enabled, the daemon still crashes:

SCF: Error CaptureAborted:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 137)
NvCaptureStatusErrorDecode Stream 0.0 failed: sof_ts 1202328950144 eof_ts 1202335245952 frame 75 error 0 data 0x00000000
NvCaptureStatusErrorDecode Capture-Error: UNKNOWN (0x00000000)
(NvCapture) Error InvalidState: Channel is in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureRequestGetAttribute(), line 1870)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 913)
SCF: Error InvalidState: Capture error with status 0 (channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 922)
SCF: Error InvalidState: Sequence order error (761 received, 0 expected, channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 929)
(NvCapture) Error InvalidState: Channel in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureReleaseRequest(), line 649)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 940)
SCF: Error InvalidState: Sending critical error event (in src/api/Session.cpp, function sendErrorEvent(), line 992)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
NvCaptureStatusErrorDecode Stream 4.0 failed: sof_ts 1202333206592 eof_ts 1202335245952 frame 58 error 0 data 0x00000000
NvCaptureStatusErrorDecode Capture-Error: UNKNOWN (0x00000000)
(NvCapture) Error InvalidState: Channel is in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureRequestGetAttribute(), line 1870)
(NvCapture) Error InvalidState: Channel is in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureRequestGetAttribute(), line 1870)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function startCaptureInternal(), line 781)
NvCaptureStatusErrorDecode Stream 2.0 failed: sof_ts 1202330577440 eof_ts 1202335245952 frame 23 error 0 data 0x00000000
NvCaptureStatusErrorDecode Capture-Error: UNKNOWN (0x00000000)
(NvCapture) Error InvalidState: Channel is in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureRequestGetAttribute(), line 1870)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureRecord.cpp, function doCSItoMemCapture(), line 532)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureRecord.cpp, function issueCapture(), line 469)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1293)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1124)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 913)
SCF: Error InvalidState: Capture error with status 0 (channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 922)
SCF: Error InvalidState: Sequence order error (760 received, 0 expected, channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 929)
(NvCapture) Error InvalidState: Channel in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureReleaseRequest(), line 649)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 940)
SCF: Error InvalidState: Sending critical error event (in src/api/Session.cpp, function sendErrorEvent(), line 992)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 913)
SCF: Error InvalidState: Capture error with status 0 (channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 922)
SCF: Error InvalidState: Sequence order error (763 received, 0 expected, channel 0) (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 929)
(NvCapture) Error InvalidState: Channel in error state, reset required (in /dvs/git/dirty/git-master_linux/camera/capture/nvcapture/capture.c, function NvCaptureReleaseRequest(), line 649)
SCF: Error InvalidState:  (propagating from src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 940)
SCF: Error InvalidState: Sending critical error event (in src/api/Session.cpp, function sendErrorEvent(), line 992)
SCF: Error InvalidState: Something went wrong with waiting on frame end (in src/services/capture/NvCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 983)
SCF: Error InvalidState:  (propagating from src/common/Utils.cpp, function workerThread(), line 116)
SCF: Error InvalidState: Worker thread CaptureScheduler frameStart failed (in src/common/Utils.cpp, function workerThread(), line 133)
SCF: Error Timeout:  (propagating from src/api/Buffer.cpp, function waitForUnlock(), line 637)
SCF: Error Timeout:  (propagating from src/components/CaptureContainerImpl.cpp, function returnBuffer(), line 358)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 908)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 395)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 87)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 908)
SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 395)
SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 87)
SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158

To run nvargus-daemon I am using the following procedure:

sudo su
service nvargus-daemon stop
export enableCamPclLogs=1
export enableCamScfLogs=1
export enableCamInfiniteTimeout=1
echo 1 > /sys/kernel/debug/bpmp/debug/clk/vi/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/isp/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/nvcsi/mrq_rate_locked
cat /sys/kernel/debug/bpmp/debug/clk/vi/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/vi/rate
cat /sys/kernel/debug/bpmp/debug/clk/isp/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/isp/rate
cat /sys/kernel/debug/bpmp/debug/clk/nvcsi/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/nvcsi/rate
/usr/sbin/nvargus-daemon
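
One way to confirm that the clock boost in the procedure above actually took effect is to read each clock's rate back and compare it with max_rate. This is a minimal sketch: `clk_at_max` and `report_clocks` are hypothetical helper names, the debugfs paths are the ones used above, and it assumes the reported rate matches max_rate exactly (the clock driver may round it).

```shell
#!/bin/sh
# CLK_BASE defaults to the debugfs path used in the procedure above.
CLK_BASE="${CLK_BASE:-/sys/kernel/debug/bpmp/debug/clk}"

# clk_at_max (hypothetical helper): returns 0 if rate equals max_rate,
# 1 if they differ, 2 if either file could not be read.
clk_at_max() {
    rate=$(cat "$CLK_BASE/$1/rate" 2>/dev/null) || return 2
    max=$(cat "$CLK_BASE/$1/max_rate" 2>/dev/null) || return 2
    [ "$rate" = "$max" ]
}

# report_clocks (hypothetical helper): prints one status line per clock.
report_clocks() {
    for clk in vi isp nvcsi; do
        rc=0
        clk_at_max "$clk" || rc=$?
        case $rc in
            0) echo "$clk: at max rate" ;;
            1) echo "$clk: below max rate" ;;
            *) echo "$clk: could not read rate (running as root? debugfs mounted?)" ;;
        esac
    done
}

report_clocks
```

Running it as root right before starting the pipeline, and again while the pipeline is streaming, would show whether the rates stay locked.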

best regards,
Andres Campos
Embedded Software Engineer

Did you use a gst-launch-1.0 command to launch the camera?
Could you try argus_camera? Also, could you check whether nvpmodel and jetson_clocks have been run?

Hello @ShaneCCC ,

Thanks again for helping me out.

I ran the tests with your suggestions. Even when launching nvargus-daemon as described in my last comment, and running sudo /usr/bin/jetson_clocks and sudo nvpmodel -m 0, I am still seeing the same behavior. Using gst-launch-1.0 or argus_camera makes no difference.

I also ran tests with v4l2-ctl, and it works as expected. I start all 4 cameras, each one in a separate console, and can then kill and start them again with no issues at all.
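
For reference, a v4l2-ctl test of this kind might look like the sketch below, which only prints one capture command per camera so each can be pasted into its own console. The device nodes (/dev/video0 to /dev/video3), the RG10 pixel format, the bypass_mode control, and the frame count are all assumptions that depend on the sensor driver; `v4l2_cmd` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed values, not taken from the post: adjust to the actual driver.
WIDTH=3840
HEIGHT=2160
PIXFMT=RG10     # assumed 10-bit Bayer format
FRAMES=300      # assumed number of frames per run

# v4l2_cmd (hypothetical helper): $1 = video device index.
# Prints the capture command for that device without running it.
v4l2_cmd() {
    echo "v4l2-ctl -d /dev/video$1 --set-fmt-video=width=$WIDTH,height=$HEIGHT,pixelformat=$PIXFMT --set-ctrl bypass_mode=0 --stream-mmap --stream-count=$FRAMES"
}

# One command per camera; run each in a separate console.
for i in 0 1 2 3; do
    v4l2_cmd "$i"
done
```

Since v4l2-ctl captures through the V4L2 path without going through nvargus-daemon, a clean start/stop here points toward the Argus teardown rather than the raw capture.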

With argus_camera, I do the same. However, as soon as I kill the first camera (it does not matter which one, nor in which order I start them), nvargus-daemon crashes.

It’s certainly very strange. And I don’t have a way to run the same test with other sensors, since none of the other sensors I have output 4k.

I will continue trying to debug the driver and other parts of the system; however, it is very hard without more certainty about what it is that nvargus-daemon does not like.

Please let me know if you have any other ideas or suggestions.

best regards,
Andres Campos
Embedded Software Engineer

Could you try multiple sessions from one argus_camera app?
You can run the command below for it.

argus_camera --module=3

Hello @ShaneCCC,

I am very sorry for the late reply.

I have performed the test you suggested. When running multiple sessions in one argus_camera app, the behavior seems just as nondeterministic as before. Sometimes it closes gracefully, other times it crashes.

I also added some delays to the power-off function of the sensor driver, and that seems to have helped, as the crashes are less frequent. However, the delay I added is in the 200ms range, which in my opinion is already too large to consider a solution: with four cameras it adds up to almost a full second of delay, and the error still shows.

Another interesting behavior I noticed is that the issue seems to happen more frequently when running the GStreamer pipeline through ssh than when I run it using a monitor and keyboard. I am not even using an X server, since I am using fakesink in my pipeline.
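
For anyone trying to reproduce this, a capture-to-fakesink pipeline of the kind described might look like the sketch below. The exact pipeline is not shown in this thread, so the caps (3840x2160 at 60fps in NVMM memory) and the sensor IDs 0 to 3 are assumptions. The -e flag (EOS on shutdown) is also an addition here, and whether it changes the crash behavior on ctrl+c is untested. `gst_cmd` is a hypothetical helper that only prints the command.

```shell
#!/bin/sh
# gst_cmd (hypothetical helper): $1 = sensor-id.
# Prints a capture-to-fakesink pipeline for that sensor without running it.
# Caps are assumed: 3840x2160 at 60 fps in NVMM memory.
gst_cmd() {
    echo "gst-launch-1.0 -e nvarguscamerasrc sensor-id=$1 ! 'video/x-raw(memory:NVMM),width=3840,height=2160,framerate=60/1' ! fakesink"
}

# One pipeline per camera (sensor IDs 0..3 are assumed).
for id in 0 1 2 3; do
    gst_cmd "$id"
done
```

Running the same printed commands once over ssh and once from a local console would give a like-for-like comparison of the two cases.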

Please let me know if you have any other suggestions or tests I can run.

best regards,
Andres Campos
Embedded Software Engineer

Are you able to try the latest release, r32.6.1?

FYI,
according to the software features list (CSI and USB Camera Features), 4K preview at 60 FPS is validated with two cameras.

Hi @JerryChang,

Thanks for your input.

Does it mean that capturing from 4 cameras at 4k@60fps should be considered as not supported by NVIDIA?

Regards,
Marco

hello MarcoMadrigal,

please consider it as not supported, since we currently don’t have camera hardware to validate four cameras at 4K@60fps.