Hi,
This is a follow-up to
I reproduced the camera error with off-the-shelf LI-JETSON-IMX274-DUAL sensors attached to an Orin, as well as with our Omnivision cameras.
For the IMX274 sensors I am using the driver and DTB provided by Nvidia.
With the IMX274 the error happened approximately once per 50 hours, always on the second camera.
With our Omnivision cameras it happens more often: approximately once per 10 hours on one Orin/camera set and once per hour on another.
The bottom line is that the camera eventually fails every time and cannot be restarted without restarting nvargus-daemon.
I tried the clock boosting suggestion from the link above, but it did not help.
I also followed the "Tips for Debugging" section of the Jetson/l4t/Camera BringUp page on eLinux.org.
I enabled tracing:
echo 1 > /sys/kernel/debug/tracing/tracing_on
echo 30720 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/tegra_rtcpu/enable
echo 1 > /sys/kernel/debug/tracing/events/freertos/enable
echo 2 > /sys/kernel/debug/camrtc/log-level
echo 1 > /sys/kernel/debug/tracing/events/camera_common/enable
echo > /sys/kernel/debug/tracing/trace
cat /sys/kernel/debug/tracing/trace
and restarted nvargus-daemon with debug logging:
killall nvargus-daemon
export enableCamPclLogs=5
export enableCamScfLogs=5
/usr/sbin/nvargus-daemon
I saw different errors in the trace, such as:
rtcpu_vinotify_error: tstamp:171639231834 cch:2 vi:1 tag:CHANSEL_NOMATCH channel:0x04 frame:0 vi_tstamp:5492455394816 data:0x0000000000000249
and
rtcpu_vinotify_error: tstamp:99974825835 cch:2 vi:1 tag:CSIMUX_FRAME channel:0xac frame:56570 vi_tstamp:3199194215744 data:0x0000000000000402
and
rtcpu_vinotify_error: tstamp:65471898117 cch:2 vi:1 tag:CSIMUX_FRAME channel:0x00 frame:55908 vi_tstamp:2095100601248 data:0x0000da6700000222
and
rtcpu_vinotify_error: tstamp:65471954857 cch:-1 vi:1 tag:CSIMUX_STREAM channel:0x00 frame:0 vi_tstamp:2095102433600 data:0x0000000000000100
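Since the failures take hours to appear, it may help to tally which error tags show up in a saved trace. Below is a minimal sketch, assuming the trace was saved to a text file and that the lines have the same layout as the ones quoted above. The function name count_vinotify_tags is hypothetical and not part of any Nvidia tool.

```shell
#!/bin/sh
# Tally rtcpu_vinotify_error events by tag.
# Reads trace text on stdin and prints one "TAG COUNT" line per tag.
# Assumption: each error line contains a " tag:NAME" field, as in the
# lines quoted above. count_vinotify_tags is a hypothetical helper name.
count_vinotify_tags() {
    grep 'rtcpu_vinotify_error:' \
        | sed -n 's/.* tag:\([A-Za-z0-9_]*\).*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}
```

For example, after saving the trace with "cat /sys/kernel/debug/tracing/trace > trace.txt", run "count_vinotify_tags < trace.txt". Comparing the tallies between the IMX274 and Omnivision setups may show whether the same tags dominate in both.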
And different errors from nvargus-daemon:
SCF: Error BadValue: NvPHSSendThroughputHints (in src/common/CameraPowerHint.cpp, function sendCameraPowerHint(), line 56)
SCF: Error Timeout: (propagating from src/components/amr/Snapshot.cpp, function waitForNewerSample(), line 91)
SCF: Error Timeout: (propagating from src/components/ac_stages/ACSynchronizeStage.cpp, function doHandleRequest(), line 126)
SCF: Error Timeout: (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 137)
SCF: Error Timeout: Sending critical error event (in src/api/Session.cpp, function sendErrorEvent(), line 979)
SCF: Error Timeout: (propagating from src/components/CaptureContainerImpl.cpp, function assignAllBuffersFromStream(), line 241)
SCF: Error Timeout: (propagating from src/components/stages/CCDataSetupStage.cpp, function doHandleRequest(), line 68)
SCF: Error Timeout: (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
(Argus) Error OverFlow: Too many pending events, ignoring new events (in src/api/EventProviderImpl.cpp, function addEvent(), line 158)
And various others including
nvargus-daemon[1016]: Module_id 30 Severity 2 : (fusa) Error: InvalidState Status syncpoint signaled but status value not updated in:/capture/src/fusaViHandler.cpp 817
which I mentioned in my previous post.
It looks like some kind of corruption in nvargus or the real-time engine, which causes seemingly random errors to pop up.
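To see which SCF errors dominate in a long daemon log, the lines can be grouped by error kind and reporting function. This is only a sketch, assuming the log was captured to a text file and that the lines follow the "SCF: Error KIND: ... function NAME(), line N" layout shown above. The function name count_scf_errors is hypothetical.

```shell
#!/bin/sh
# Group SCF error lines by error kind and reporting function.
# Reads nvargus-daemon log text on stdin and prints "KIND FUNCTION COUNT".
# Assumption: lines look like
#   SCF: Error Timeout: (... function doExecute(), line 137)
# count_scf_errors is a hypothetical helper name.
count_scf_errors() {
    sed -n 's/.*SCF: Error \([A-Za-z]*\):.*function \([A-Za-z0-9_]*\)().*/\1 \2/p' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $3, $1 }'
}
```

Usage would be along the lines of "count_scf_errors < nvargus.log", where nvargus.log is whatever file the daemon output was redirected to.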
How should I troubleshoot this further?
cat /etc/nv_tegra_release
R35 (release), REVISION: 1.0, GCID: 31346300, BOARD: t186ref, EABI: aarch64, DATE: Thu Aug 25 18:41:45 UTC 2022
Thank you