Consistent CSI timeout on B01 SOM

I have a Jetson Nano B01 SOM that consistently fails CSI-2 capture (both libargus and V4L2). I have verified that the carrier board and image sensors are fine by swapping in other Jetson Nano SOMs, flashed with the same rootfs and running the same kernel + device tree, and capturing images successfully.
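
For reference, this is how I cross-checked that the units really do run identical software (paths assume a stock L4T boot layout, and dtc needs to be installed; adjust if your setup differs):

cat /etc/nv_tegra_release                                  # L4T release string
uname -r                                                   # running kernel version
md5sum /boot/Image                                         # kernel image checksum
dtc -I fs -O dts /proc/device-tree 2>/dev/null | md5sum    # checksum of the live device tree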

The v4l2-ctl capture attempt and the resulting kernel output:

v4l2-ctl -d /dev/video0 --stream-mmap --stream-count=1 --stream-to=test.raw
Jan 28 15:58:33 localhost kernel: [   18.831295] video4linux video0: frame start syncpt timeout!0
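
For anyone reproducing this: it can also help to confirm the node actually advertises the expected sensor modes and to pin the capture format explicitly, in case the default negotiation picks something the sensor never outputs (1920x1080/RG10 below is only an example; substitute whatever --list-formats-ext reports for your sensor):

v4l2-ctl -d /dev/video0 --list-formats-ext
v4l2-ctl -d /dev/video0 --set-fmt-video=width=1920,height=1080,pixelformat=RG10 --stream-mmap --stream-count=1 --stream-to=test.raw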

The corresponding nvargus-daemon timeout looks like this:

Jan 28 16:00:12 localhost nvargus-daemon[5507]: Error: waitCsiFrameEnd timeout guid 2
Jan 28 16:00:12 localhost nvargus-daemon[5507]: ************VI/CSI Debug Registers**********
Jan 28 16:00:12 localhost nvargus-daemon[5507]: VI_CFG_INTERRUPT_MASK_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: VI_CFG_INTERRUPT_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: VI_CSI_0_ERROR_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: VI_CSI_0_ERROR_INT_MASK_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: VI_CSI_1_ERROR_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: VI_CSI_1_ERROR_INT_MASK_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_PIXEL_PARSER_A_INTERRUPT_MASK_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_PIXEL_PARSER_A_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_PIXEL_PARSER_B_INTERRUPT_MASK_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_PIXEL_PARSER_B_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_CIL_A_INTERRUPT_MASK_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_CIL_A_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_CILA_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_CIL_B_INTERRUPT_MASK_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_CIL_B_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_CILB_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_CSI_READONLY_STATUS_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_DEBUG_CONTROL_0 = 0x276c7200
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_DEBUG_COUNTER_0_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_DEBUG_COUNTER_1_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: CSI_DEBUG_COUNTER_2_0 = 0x00000000
Jan 28 16:00:12 localhost nvargus-daemon[5507]: *****************************************
Jan 28 16:00:12 localhost nvargus-daemon[5507]: Error: waitCsiFrameEnd Something went wrong with waiting on csi frame end
Jan 28 16:00:13 localhost kernel: [  118.263051] fence timeout on [ffffffc071516540] after 1500ms
Jan 28 16:00:13 localhost kernel: [  118.263094] fence timeout on [ffffffc0715163c0] after 1500ms
Jan 28 16:00:13 localhost kernel: [  118.263171] name=[nvhost_sync:24], current value=0 waiting value=1
Jan 28 16:00:13 localhost kernel: [  118.263202] ---- mlocks ----
Jan 28 16:00:13 localhost kernel: [  118.263240]
Jan 28 16:00:13 localhost kernel: [  118.263256] ---- syncpts ----
Jan 28 16:00:13 localhost kernel: [  118.263302] id 1 (disp1_a) min 1 max 1 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263334] id 2 (disp1_b) min 1 max 1 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263364] id 3 (disp1_c) min 1 max 1 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263430] id 15 (gm20b_507) min 686 max 686 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263463] id 16 (gm20b_506) min 4 max 4 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263496] id 17 (gm20b_505) min 2 max 2 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263527] id 18 (gm20b_504) min 2 max 2 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263557] id 19 (gm20b_503) min 4 max 4 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263595] id 22 (54080000.vi_2) min 2 max 7 refs 7 (previous client : 54080000.vi_2)
Jan 28 16:00:13 localhost kernel: [  118.263631] id 24 (54080000.vi_3) min 0 max 3 refs 4 (previous client : 54080000.vi_3)
Jan 28 16:00:13 localhost kernel: [  118.263664] id 25 (54080000.vi_4) min 0 max 3 refs 4 (previous client : 54080000.vi_4)
Jan 28 16:00:13 localhost kernel: [  118.263707] id 30 (gm20b_502) min 8 max 8 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263738] id 31 (gm20b_501) min 6 max 6 refs 1 (previous client : )
Jan 28 16:00:13 localhost kernel: [  118.263769] id 32 (gm20b_500) min 6 max 6 refs 1 (previous client : )

If I read the syncpt dump correctly, the channel being waited on (nvhost_sync:24, i.e. 54080000.vi_3) never advances past 0 even though captures were queued up to max 3, which matches the v4l2 "frame start syncpt timeout": the VI apparently never sees a start-of-frame from the CSI receiver. Is there any possibility this is caused by something software-managed, e.g. marginal timing settings in the device tree? Given that every other SOM I have tested (roughly six samples) works with the identical image, is it safe to conclude this SOM is defective, and should we initiate an RMA with NVIDIA or with the supplier (Arrow)?
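
For what it's worth, the main device-tree timing knob I am aware of is the per-mode cil_settletime property in the sensor node (per NVIDIA's sensor driver programming guide, a value of "0" tells the driver to auto-calibrate the THS-settle time). A quick way to compare what the live device tree actually carries across SOMs (node paths will vary by sensor):

# Print each sensor mode's cil_settletime from the running device tree
find /proc/device-tree -name cil_settletime | while read -r p; do
  printf '%s: ' "$p"; tr -d '\0' < "$p"; echo
done

It reads the same on this unit as on the working ones, which is part of why I suspect the hardware.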

Thanks

Please take it to RMA.

Thanks

Got it, thanks. Can you confirm that the RMA should go through Arrow Electronics?

Yes, please contact the distributor first. Thanks