Argus camera Frame acquire timeout occurring randomly

Hello!

I have noticed a problem which sometimes occurs, when I’m calling:
UniqueObj<Frame> frame(raw_consumer->acquireFrame(15000000000, &capture_status));
Sometimes capture_status comes back as 6, which means timeout. I’ve been trying to debug this, but with every log I enabled by following Jetson/l4t/Camera BringUp - eLinux.org, I only see the timeout with no further information. I call the function while the Argus CaptureSession is idle. The worst part is that it happens completely at random, and sometimes only a device restart fixes it. I have tried both low and high exposure frames. I have also tried submitting the exact same Request 200 times: it failed on the 80th try and could not recover until I restarted. Here is the log from nvargus-daemon from when I managed to get it stuck. How can I debug this? I am becoming desperate.

PowerServiceCore:handleRequests: timePassed = 3031
SCF: Error Timeout:  (propagating from src/services/capture/CaptureServiceEvent.cpp, function wait(), line 59)
Error: Camera HwEvents wait, this may indicate a hardware timeout occured,abort current/incoming cc
Error: waitCsiFrameEnd timeout guid 1
VI Stream Id = 4 Virtual Channel = 0
************VI Debug Registers**********
VI_CSIMUX_STAT_FRAME_16  = 0x00000000
VI_CSIMUX_FRAME_STATUS_0         = 0x00000000
VI_CFG_INTERRUPT_STATUS_0        = 0x3f000000
VI_ISPBUFA_ERROR_0       = 0x00000000
VI_FMLITE_ERROR_0        = 0x00000000
VI_NOTIFY_ERROR_0        = 0x00000000
*****************************************
CSI Stream Id = 4 Brick Id = 2
************CSI Debug Registers**********
CILA_INTR_STATUS_CILA[0x30400]   = 0x00000089
CILB_INTR_STATUS_CILB[0x30c00]   = 0x00000088
INTR_STATUS[0x300a4]     = 0x00000000
INTR_STATUS[0x300a4]     = 0x00000000
ERR_INTR_STATUS[0x300ac]         = 0x00000000
ERROR_STATUS2VI_VC0[0x30094]     = 0x00000000
ERROR_STATUS2VI_VC1[0x30098]     = 0x00000000
ERROR_STATUS2VI_VC2[0x3009c]     = 0x00000000
ERROR_STATUS2VI_VC3[0x300a0]     = 0x00000000
*****************************************
SCF: Error BadValue: timestamp cannot be 0 (in src/services/capture/NvViCsiHw.cpp, function waitCsiFrameEnd(), line 733)
SCF: Error BadValue:  (propagating from src/common/Utils.cpp, function workerThread(), line 116)
SCF: Error BadValue: Worker thread ViCsiHw frameComplete failed (in src/common/Utils.cpp, function workerThread(), line 133)
Error: waitCsiFrameStart timeout guid 1
VI Stream Id = 4 Virtual Channel = 0
************VI Debug Registers**********
VI_CSIMUX_STAT_FRAME_16  = 0x00000000
VI_CSIMUX_FRAME_STATUS_0         = 0x00000000
VI_CFG_INTERRUPT_STATUS_0        = 0x3f000000
VI_ISPBUFA_ERROR_0       = 0x00000000
VI_FMLITE_ERROR_0        = 0x00000000
VI_NOTIFY_ERROR_0        = 0x00000000
*****************************************
CSI Stream Id = 4 Brick Id = 2
************CSI Debug Registers**********
CILA_INTR_STATUS_CILA[0x30400]   = 0x00000089
CILB_INTR_STATUS_CILB[0x30c00]   = 0x00000088
INTR_STATUS[0x300a4]     = 0x00000000
INTR_STATUS[0x300a4]     = 0x00000000
ERR_INTR_STATUS[0x300ac]         = 0x00000000
ERROR_STATUS2VI_VC0[0x30094]     = 0x00000000
ERROR_STATUS2VI_VC1[0x30098]     = 0x00000000
ERROR_STATUS2VI_VC2[0x3009c]     = 0x00000000
ERROR_STATUS2VI_VC3[0x300a0]     = 0x00000000
*****************************************
SCF: Error BadValue: timestamp cannot be 0 (in src/services/capture/NvViCsiHw.cpp, function waitCsiFrameStart(), line 637)
SCF: Error BadValue:  (propagating from src/common/Utils.cpp, function workerThread(), line 116)
SCF: Error BadValue: Worker thread ViCsiHw frameStart failed (in src/common/Utils.cpp, function workerThread(), line 133)
waitForIdleLocked remaining request 214
SCF: Error Timeout: waitForIdle() timed out (in src/api/Session.cpp, function waitForIdleLocked(), line 927)
SCF: Error Timeout:  (propagating from src/api/Session.cpp, function abortCaptures(), line 893)
waitForIdleLocked remaining request 214
SCF: Error Timeout: waitForIdle() timed out (in src/api/Session.cpp, function waitForIdleLocked(), line 927)
SCF: Error Timeout:  (propagating from src/api/Session.cpp, function abortCaptures(), line 893)
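When comparing several of these dumps, it can help to pull out only the registers that are non-zero. The following is a minimal sketch, assuming the dump keeps the `NAME = 0xVALUE` layout shown above; `nonZeroRegisters` is a hypothetical helper name, not part of Argus or any NVIDIA tool, and it does not attempt to interpret the bit fields.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collect "NAME = 0xVALUE" lines whose value is non-zero.
// Lines without "= 0x" (e.g. "VI Stream Id = 4 ...") are ignored.
std::vector<std::pair<std::string, uint32_t>> nonZeroRegisters(const std::string& log) {
    std::vector<std::pair<std::string, uint32_t>> out;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line)) {
        const auto eq = line.find("= 0x");
        if (eq == std::string::npos) continue;

        // Register name is everything before '=', with trailing blanks trimmed.
        std::string name = line.substr(0, eq);
        const auto last = name.find_last_not_of(" \t");
        if (last == std::string::npos) continue;
        name.erase(last + 1);

        // Parse the hexadecimal value that follows "= ".
        uint32_t value = 0;
        try {
            value = static_cast<uint32_t>(std::stoul(line.substr(eq + 2), nullptr, 16));
        } catch (const std::exception&) {
            continue;  // not a parsable hex value, skip the line
        }
        if (value != 0) out.emplace_back(name, value);
    }
    return out;
}
```

Fed the dump above, this would leave only VI_CFG_INTERRUPT_STATUS_0, CILA_INTR_STATUS_CILA and CILB_INTR_STATUS_CILB, which are the values worth comparing between a good and a stuck capture.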

hello therealmatiss,

may I know which JetPack release you’re working with,
and… what’s the sensor type? does it output camera frames to the CSI brick continuously?

the “timestamp cannot be 0” error just looks incorrect. the stream should not have zero timestamps.
you may probe the hardware and examine the MIPI signaling.

I am using JetPack 4.6.1, and the sensor is a Sony IMX283 with my own driver. What do you mean by the sensor type?

I have probed the MIPI signal before - it looks good but I haven’t probed when this issue arises. The thing is that this happens randomly once per 50 captures or so - very unpredictable.

hello therealmatiss,

I assume you have connected this Sony IMX283 to the CSI interface directly.

which sensor modes are you running? please check with $ v4l2-ctl -d /dev/video0 --list-formats-ext.

please also refer to Approaches for Validating and Testing the V4L2 Driver. are you able to reproduce this when using V4L2 IOCTLs to verify basic functionality? it helps to narrow down the issue.

Hey!

Yes, I have connected it directly. All the sensor modes work. Mainly I am running the full-resolution (5496x3694) 12-bit sensor mode @ 20 fps.

ioctl: VIDIOC_ENUM_FMT
        Index       : 0
        Type        : Video Capture
        Pixel Format: 'RG10'
        Name        : 10-bit Bayer RGRG/GBGB
                Size: Discrete 5496x3694
                        Interval: Discrete 0.050s (20.000 fps)
                Size: Discrete 5496x3694
                        Interval: Discrete 0.050s (20.000 fps)
                Size: Discrete 5496x3112
                        Interval: Discrete 0.040s (25.000 fps)
                Size: Discrete 3128x3046
                        Interval: Discrete 0.040s (25.000 fps)
                Size: Discrete 2796x1842
                        Interval: Discrete 0.020s (50.000 fps)
                Size: Discrete 2796x1556
                        Interval: Discrete 0.020s (50.000 fps)
                Size: Discrete 1832x1234
                        Interval: Discrete 0.050s (20.000 fps)
                Size: Discrete 4656x3694
                        Interval: Discrete 0.050s (20.000 fps)

        Index       : 1
        Type        : Video Capture
        Pixel Format: 'RG12'
        Name        : 12-bit Bayer RGRG/GBGB
                Size: Discrete 5496x3694
                        Interval: Discrete 0.050s (20.000 fps)
                Size: Discrete 5496x3694
                        Interval: Discrete 0.050s (20.000 fps)
                Size: Discrete 5496x3112
                        Interval: Discrete 0.040s (25.000 fps)
                Size: Discrete 3128x3046
                        Interval: Discrete 0.040s (25.000 fps)
                Size: Discrete 2796x1842
                        Interval: Discrete 0.020s (50.000 fps)
                Size: Discrete 2796x1556
                        Interval: Discrete 0.020s (50.000 fps)
                Size: Discrete 1832x1234
                        Interval: Discrete 0.050s (20.000 fps)
                Size: Discrete 4656x3694
                        Interval: Discrete 0.050s (20.000 fps)

With V4L2 capture commands I am not able to reproduce this error. However, as I mentioned, captures through Argus work 99.9% of the time, but when, presumably, something goes wrong on a single frame, nvargus-daemon can’t get out of the error state: after systemctl restart nvargus-daemon.service it starts working again.
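Until the root cause is found, one way to contain a stuck capture is to wrap the acquire call in a bounded retry that runs a recovery step after each timeout. This is only a sketch under assumptions: `acquireWithRecovery`, `AcquireResult` and both callbacks are hypothetical names, `acquire` stands in for a wrapper around acquireFrame that returns the status code, and `recover` stands in for whatever recovery turns out to work (for example recreating the CaptureSession, which nothing in this thread confirms is sufficient). Only the value 6 for timeout comes from the post.

```cpp
#include <cstdint>
#include <functional>

constexpr int kStatusOk = 0;       // assumption: 0 means success
constexpr int kStatusTimeout = 6;  // value reported in the post for a timeout

struct AcquireResult {
    int status;      // last status returned by the acquire callback
    int attempts;    // number of acquire calls made
    int recoveries;  // number of recover calls made
};

// Retry `acquire` up to `maxAttempts` times, calling `recover` after each
// timeout except the last. Any non-timeout status (success or another
// error) ends the loop immediately.
AcquireResult acquireWithRecovery(const std::function<int(uint64_t)>& acquire,
                                  const std::function<void()>& recover,
                                  uint64_t timeoutNs, int maxAttempts) {
    AcquireResult r{kStatusTimeout, 0, 0};
    while (r.attempts < maxAttempts) {
        r.status = acquire(timeoutNs);
        ++r.attempts;
        if (r.status != kStatusTimeout) break;
        if (r.attempts < maxAttempts) {
            recover();
            ++r.recoveries;
        }
    }
    return r;
}
```

In an application, `acquire` would call raw_consumer->acquireFrame(timeoutNs, &capture_status) and return capture_status, and the caller would check the result against kStatusOk before using the frame. Counting attempts and recoveries makes it easy to log how often the failure actually occurs.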

This looks just like an issue I found here: question about FrameConsumer acquireFrame. I do have a single-camera setup, though.

By the way, could EGL_STREAM_MODE_FIFO and EGL_STREAM_MODE_MAILBOX have any impact on this? Also, this occurs on a PIXEL_FMT_RAW16 stream, where each frame has its own request. I also have a “video preview” stream on the same sensor mode, which uses repeat capture and never receives an invalid frame, so I doubt that it’s a hardware issue (it could be, but it seems unlikely given that the preview stream has no failing frames).

hello therealmatiss,

that was a race condition in the multi-cam use case, and we’ve checked fixes into r32.7.x to address the failure.
this looks more like a hardware issue, since you reproduced it with a single camera and a very low failure rate.

EGL stream uses mailbox mode by default. however, FIFO or MAILBOX should not impact this.
one thing you may try: check with the EGLStream::GetBuffer() method to ensure frames are there…

in addition,
please try adding the set_mode_delay_ms device tree property.
this setting configures the maximum wait time for the first frame after capture starts; the unit is milliseconds.
please see also Device Properties.
thanks
