Orin Nano Devkit CAM0 interface video capture intermittently "freezes"

We have a fielded a MIPI CSI-2 compliant custom camera (8 bit grayscale) that we are trying with the Orin Nano DevKit with Jetson Linux 36.3 (JetPack 6 rev2). We are capturing video @90Hz using v4l2 interface and both CAM0 and CAM1 connectors of the devkit carrier board. However, CAM0 intermittently freezes (video capture fail) during streaming. This is happening on multiple devkits. Once the capture freezes it does not recover.
Capture was tested using:

v4l2-ctl --set-fmt-video=width=1680,height=1050,pixelformat='GREY' --stream-mmap --set-ctrl=sensor_mode=0 --stream-count=100 -d /dev/video<n>`

Trace was performed using:

modprobe rtcpu_debug

echo 1 > /sys/kernel/debug/tracing/tracing_on
echo 30720 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/tegra_rtcpu/enable
echo 1 > /sys/kernel/debug/tracing/events/freertos/enable
echo 3 > /sys/kernel/debug/camrtc/log-level
echo 1 > /sys/kernel/debug/tracing/events/camera_common/enable
echo > /sys/kernel/debug/tracing/trace

I am attaching the following:

dmesg log
kern.log
trace
device-tree overlay for the custom camera
kernel patches

custom_cam_kern_20240916a.log (6.6 KB)
custom_cam_trace_20240916a.txt (488.2 KB)
dmesg_20240916a.txt (62.7 KB)
device-tree.patch.txt (11.8 KB)
kernel.patch.txt (1.7 KB)
nvidia-oot.patch.txt (26.3 KB)
custom_cam_extracted_proc.dts.txt (330.0 KB)

Does the trace log only for the CAM0?
Looks like capture done well from the trace log.

Yes, I was only running on Cam0. But the trace goes bad after a few seconds. Never get “CHANSEL_PXL_EOF” and the channel gets reset by “tegra_channel_capture_setup”

     kworker/3:0-31      [003] .......  1438.633205: rtcpu_vinotify_event: tstamp:45497315265 cch:0 vi:0 tag:CHANSEL_PXL_SOF channel:0x23 frame:0 vi_tstamp:1455903330688 data:0x0000000000000001
     kworker/3:0-31      [003] .......  1438.633206: rtcpu_vinotify_event: tstamp:45497315546 cch:0 vi:0 tag:VIFALC_ACTIONLST channel:0x23 frame:0 vi_tstamp:1455903386656 data:0x0000000008020080
     kworker/3:0-31      [003] .......  1438.633206: rtcpu_vinotify_event: tstamp:45497315792 cch:0 vi:0 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:1455903678464 data:0x799d580010000000
     kworker/3:0-31      [003] .......  1438.633207: rtcpu_vinotify_event: tstamp:45497316073 cch:0 vi:0 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:1455903721184 data:0x0000000031000081
     kworker/3:0-31      [003] .......  1438.633210: rtcpu_vinotify_event: tstamp:45497316319 cch:0 vi:0 tag:CHANSEL_PXL_EOF channel:0x23 frame:0 vi_tstamp:1455913702080 data:0x0000000004190002
     kworker/3:0-31      [003] .......  1438.633211: rtcpu_vinotify_event: tstamp:45497316606 cch:0 vi:0 tag:FE channel:0x00 frame:0 vi_tstamp:1455913702592 data:0x0000000000000021
     kworker/3:0-31      [003] .......  1438.633212: rtcpu_vinotify_event: tstamp:45497316904 cch:0 vi:0 tag:ATOMP_FE channel:0x00 frame:0 vi_tstamp:1455913702688 data:0x0000000800000000
     kworker/3:0-31      [003] .......  1438.633212: rtcpu_vinotify_event: tstamp:45497317184 cch:0 vi:0 tag:ATOMP_FRAME_DONE channel:0x23 frame:0 vi_tstamp:1455913703232 data:0x0000000000000000
     kworker/3:0-31      [003] .......  1438.633213: rtcpu_vinotify_event: tstamp:45497317429 cch:0 vi:0 tag:VIFALC_ACTIONLST channel:0x23 frame:0 vi_tstamp:1455913755232 data:0x0000000002020080
     kworker/3:0-31      [003] .......  1438.633213: rtcpu_vinotify_event: tstamp:45497833884 cch:0 vi:0 tag:VIFALC_ACTIONLST channel:0x23 frame:0 vi_tstamp:1455913778592 data:0x000000000a020080
     kworker/3:0-31      [003] .......  1438.633214: rtcpu_vinotify_event: tstamp:45497834141 cch:0 vi:0 tag:VIFALC_ACTIONLST channel:0x23 frame:0 vi_tstamp:1455913783488 data:0x0000000000020080
     kworker/3:0-31      [003] .......  1438.633214: rtcpu_vinotify_event: tstamp:45497834426 cch:0 vi:0 tag:VIFALC_ACTIONLST channel:0x23 frame:0 vi_tstamp:1455913823520 data:0x0000000007020081
     kworker/3:0-31      [003] .......  1438.633214: rtcpu_vinotify_event: tstamp:45497834675 cch:0 vi:0 tag:FS channel:0x00 frame:0 vi_tstamp:1455914406048 data:0x0000000000000011
     kworker/3:0-31      [003] .......  1438.633215: rtcpu_vinotify_event: tstamp:45497834964 cch:0 vi:0 tag:ATOMP_FS channel:0x00 frame:0 vi_tstamp:1455914406144 data:0x0000000800000000
     kworker/3:0-31      [003] .......  1438.633215: rtcpu_vinotify_event: tstamp:45497835216 cch:0 vi:0 tag:CHANSEL_EMBED_SOF channel:0x23 frame:0 vi_tstamp:1455914425792 data:0x0000000000000004
     kworker/3:0-31      [003] .......  1438.633216: rtcpu_vinotify_event: tstamp:45497835497 cch:0 vi:0 tag:CHANSEL_EMBED_EOF channel:0x23 frame:0 vi_tstamp:1455914433184 data:0x0000000000000008
     kworker/3:0-31      [003] .......  1438.633216: rtcpu_vinotify_event: tstamp:45497835746 cch:0 vi:0 tag:ATOMP_EMB_DATA_DONE channel:0x23 frame:0 vi_tstamp:1455914434368 data:0x0000000000000000
     kworker/3:0-31      [003] .......  1438.633217: rtcpu_vinotify_event: tstamp:45497836026 cch:0 vi:0 tag:VIFALC_ACTIONLST channel:0x23 frame:0 vi_tstamp:1455914469408 data:0x0000000003020081
     kworker/3:0-31      [003] .......  1438.633217: rtcpu_vinotify_event: tstamp:45497836275 cch:0 vi:0 tag:CHANSEL_PXL_SOF channel:0x23 frame:0 vi_tstamp:1455914435616 data:0x0000000000000001**
     kworker/3:0-31      [003] .......  1438.633217: rtcpu_vinotify_event: tstamp:45497836555 cch:0 vi:0 tag:VIFALC_ACTIONLST channel:0x23 frame:0 vi_tstamp:1455914491232 data:0x0000000008020081**
     kworker/3:0-31      [003] .......  1438.633218: rtcpu_vinotify_event: tstamp:45497836804 cch:0 vi:0 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:1455914809344 data:0x799d550010000000**
     kworker/3:0-31      [003] .......  1438.633218: rtcpu_vinotify_event: tstamp:45497837084 cch:0 vi:0 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:1455914852064 data:0x0000000031000082**
vi-output, cust-5237    [000] .......  1441.222503: tegra_channel_capture_setup: vnc_id 0 W 1680 H 1050 fmt 5
 vi-output, cust-5236    [003] .......  1441.222729: vi_task_submit: class_id:48 ch:0 syncpt_id:22 syncpt_thresh:941 pid:5236 tid:5236
 vi-output, cust-5236    [003] .......  1441.223016: vi_task_submit: class_id:48 ch:0 syncpt_id:22 syncpt_thresh:941 pid:5236 tid:5236
 vi-output, cust-5236    [003] .......  1441.223334: vi_task_submit: class_id:48 ch:0 syncpt_id:22 syncpt_thresh:941 pid:5236 tid:5236
 vi-output, cust-5236    [003] .......  1441.223647: vi_task_submit: class_id:48 ch:0 syncpt_id:22 syncpt_thresh:941 pid:5236 tid:5236
     kworker/3:0-31      [003] .......  1441.257142: rtcpu_vinotify_error: tstamp:45579069226 cch:-1 vi:0 tag:CSIMUX_FRAME channel:0x00 frame:0 vi_tstamp:1458529349472 data:0x0000000000000121
     kworker/3:0-31      [003] .......  1441.257146: rtcpu_vinotify_event: tstamp:45579071444 cch:-1 vi:0 tag:CSIMUX_FRAME channel:0x00 frame:0 vi_tstamp:1458529349472 data:0x0000000000000121
     kworker/3:0-31      [003] .......  1441.257147: rtcpu_vinotify_event: tstamp:45579071730 cch:-1 vi:0 tag:CHANSEL_SHORT_FRAME channel:0x02 frame:0 vi_tstamp:1458529349504 data:0x0343200001000000
     kworker/3:0-31      [003] .......  1441.257147: rtcpu_vinotify_event: tstamp:45579071986 cch:-1 vi:0 tag:ATOMP_FE channel:0x00 frame:0 vi_tstamp:1458529349600 data:0x0000000800000000
     kworker/3:0-31      [003] .......  1441.257149: rtcpu_vinotify_event: tstamp:45579540270 cch:0 vi:0 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:1458531042016 data:0x799d580010000000
     kworker/3:0-31      [003] .......  1441.257150: rtcpu_vinotify_event: tstamp:45579540532 cch:0 vi:0 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:1458531084096 data:0x0000000031000001
     kworker/3:0-31      [003] .......  1441.257150: rtcpu_vinotify_event: tstamp:45579540820 cch:0 vi:0 tag:VIFALC_ACTIONLST channel:0x23 frame:0 vi_tstamp:1458531105792 data:0x0000000007020001
     kworker/3:0-31      [003] .......  1441.257151: rtcpu_vinotify_event: tstamp:45579541072 cch:0 vi:0 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:1458531384192 data:0x799d550010000000
     kworker/3:0-31      [003] .......  1441.257152: rtcpu_vinotify_event: tstamp:45579541358 cch:0 vi:0 tag:VIFALC_TDSTATE channel:0x23 frame:0 vi_tstamp:1458531426912 data:0x0000000031000002

Looks like sensor didn’t send FE package cause the problem. It could be somehow sensor stop streaming.

The camera never stopped streaming (we can query an internal frame counter via i2c). Also, the stream can often be recovered by killing the capture app, waiting a short period of time and then restarting. The camera is a mature, fielded product. We tried multiple cameras on multiple Orin devkits. Any ideas on additional debugging information we can collect to find out why the capture stops?

Boost the clocks to try.

sudo su
echo 1 > /sys/kernel/debug/bpmp/debug/clk/vi/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/isp/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/nvcsi/mrq_rate_locked
echo 1 > /sys/kernel/debug/bpmp/debug/clk/emc/mrq_rate_locked
cat /sys/kernel/debug/bpmp/debug/clk/vi/max_rate |tee /sys/kernel/debug/bpmp/debug/clk/vi/rate
cat /sys/kernel/debug/bpmp/debug/clk/isp/max_rate | tee  /sys/kernel/debug/bpmp/debug/clk/isp/rate
cat /sys/kernel/debug/bpmp/debug/clk/nvcsi/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/nvcsi/rate
cat /sys/kernel/debug/bpmp/debug/clk/emc/max_rate | tee /sys/kernel/debug/bpmp/debug/clk/emc/rate

Hi Shane,
Sorry for the delay. I did not see any change in behavior when I boosted the clocks. Is there more information that I could collect (maybe in NVCSI) to see why the stream suddenly stops working?

Current design we checking the trace log for debugging.

How about the MIPI timing parameters, could they be off for Camera 0 because of the lanes swaps?