Streaming stops randomly using v4l2 camera

Hello,
We are running a pipeline that involves 2 camera streams going together into a stream mix for inference.
While this pipeline worked fine with older jetpack (4.5), it seems to randomly stuck with the jetpack 4.5.1.
We are using The duffel yocto branch (32.5.1).
Camera is from Econ (ar1335).

Looking at the pipeline, when it doesn’t work (50% occurrence), we see no data coming out of the v4l2src at all.

The pipeline looks like this:
1st cameras → tee0
tee0 → mux0 → infer → fakesink
2nd camera → tee1
tee1 → mux0 → infer → fakesink
tee0 → rtspsink
tee1 → rtspsink

If you run this for one camera only the issue rarely happens. Same applies to if you run this with no inference which leads me to suspect this is a clock issue.

When it stop (pipe stuck waiting for new data), I see the following:

       <...>-95390 [000] .... 82642.874927: rtcpu_vinotify_error: tstamp:2582914930583 tag:CHANSEL_NOMATCH channel:0x01 frame:1216 vi_tstamp:2582912158223 data:0x000003c9
       <...>-95390 [000] .... 82642.874930: rtcpu_vinotify_event: tstamp:2582914930885 tag:CSIMUX_FRAME channel:0x00 frame:1216 vi_tstamp:2582912157318 data:0x000000a0
       <...>-95390 [000] .... 82642.874931: rtcpu_vinotify_event: tstamp:2582914931049 tag:CHANSEL_SHORT_FRAME channel:0x01 frame:1216 vi_tstamp:2582912157318 data:0x01000000
       <...>-95390 [000] .... 82642.874933: rtcpu_vinotify_event: tstamp:2582914931194 tag:RESERVED_19 channel:0x22 frame:192 vi_tstamp:2582113742368 data:0x01020001
       <...>-95390 [000] .... 82642.874933: rtcpu_vinotify_event: tstamp:2582914931353 tag:FS channel:0x00 frame:0 vi_tstamp:2582912157318 data:0x00000010
       <...>-95390 [000] .... 82642.874934: rtcpu_vinotify_event: tstamp:2582914931493 tag:ATOMP_FE channel:0x00 frame:1216 vi_tstamp:2582912157319 data:0x00000000
       <...>-95390 [000] .... 82642.874934: rtcpu_vinotify_event: tstamp:2582914931653 tag:ATOMP_FS channel:0x00 frame:0 vi_tstamp:2582912157319 data:0x00000000
       <...>-95390 [000] .... 82642.874935: rtcpu_vinotify_event: tstamp:2582914931790 tag:CSIMUX_FRAME channel:0x00 frame:0 vi_tstamp:2582912157333 data:0x000000a0
       <...>-95390 [000] .... 82642.874935: rtcpu_vinotify_event: tstamp:2582914931954 tag:CHANSEL_SHORT_FRAME channel:0x01 frame:0 vi_tstamp:2582912157333 data:0x01000000
       <...>-95390 [000] .... 82642.874936: rtcpu_vinotify_event: tstamp:2582914932089 tag:RESERVED_19 channel:0x22 frame:0 vi_tstamp:2582113766176 data:0x01020001
       <...>-95390 [000] .... 82642.874937: rtcpu_vinotify_event: tstamp:2582914932244 tag:FS channel:0x00 frame:1216 vi_tstamp:2582912157333 data:0x00000010
       <...>-95390 [000] .... 82642.874937: rtcpu_vinotify_event: tstamp:2582914932382 tag:ATOMP_FE channel:0x00 frame:0 vi_tstamp:2582912157334 data:0x00000000
       <...>-95390 [000] .... 82642.874938: rtcpu_vinotify_event: tstamp:2582914932541 tag:CHANSEL_NOMATCH channel:0x01 frame:1216 vi_tstamp:2582912158223 data:0x000003c9
       <...>-95390 [000] .... 82642.874938: rtcpu_vinotify_event: tstamp:2582914932677 tag:CHANSEL_PXL_EOF channel:0x23 frame:0 vi_tstamp:2582912302166 data:0x0c2f0002
       <...>-95390 [000] .... 82642.874939: rtcpu_vinotify_event: tstamp:2582914932832 tag:FE channel:0x00 frame:0 vi_tstamp:2582912302181 data:0x00000022
       <...>-95390 [000] .... 82642.874939: rtcpu_vinotify_event: tstamp:2582914932969 tag:ATOMP_FE channel:0x00 frame:0 vi_tstamp:2582912302181 data:0x00000000
       <...>-95390 [000] .... 82642.874940: rtcpu_vinotify_event: tstamp:2582914933126 tag:ATOMP_FRAME_DONE channel:0x23 frame:0 vi_tstamp:2582912302187 data:0x00000000
       <...>-95390 [000] .... 82642.874941: rtcpu_vinotify_event: tstamp:2582914933259 tag:RESERVED_19 channel:0x23 frame:0 vi_tstamp:2582118376800 data:0x02020001
       <...>-95390 [000] .... 82642.874941: rtcpu_vinotify_event: tstamp:2582914933418 tag:RESERVED_19 channel:0x23 frame:0 vi_tstamp:2582118380736 data:0x00020001
       <...>-95390 [000] .... 82642.874942: rtcpu_vinotify_event: tstamp:2582914933555 tag:RESERVED_19 channel:0x23 frame:0 vi_tstamp:2582118387808 data:0x07020002
       <...>-95390 [000] .... 82642.874942: rtcpu_vinotify_event: tstamp:2582914933712 tag:FS channel:0x00 frame:0 vi_tstamp:2582912654928 data:0x00000012
       <...>-95390 [000] .... 82642.874943: rtcpu_vinotify_event: tstamp:2582914933848 tag:ATOMP_FS channel:0x00 frame:0 vi_tstamp:2582912654929 data:0x00000000
       <...>-95390 [000] .... 82642.874944: rtcpu_vinotify_event: tstamp:2582914934005 tag:CHANSEL_PXL_SOF channel:0x23 frame:0 vi_tstamp:2582912654944 data:0x00000001

Means we see CHANSEL_NOMATCH with 0x3c9 (not sure how this decodes.

I also see this error repeating all over when stream runs:

<…>-95390 [000] … 82642.704157: rtcpu_nvcsi_intr: tstamp:2582910466265 class:GLOBAL type:PHY_INTR0 phy:0 cil:0 st:0 vc:0 status:0x00000010
<…>-95390 [000] … 82642.704157: rtcpu_nvcsi_intr: tstamp:2582910466265 class:CORRECTABLE_ERR type:STREAM_VC phy:0 cil:0 st:0 vc:0 status:0x00000004
<…>-95390 [000] … 82642.704158: rtcpu_nvcsi_intr: tstamp:2582910466265 class:CORRECTABLE_ERR type:PHY_INTR phy:0 cil:0 st:0 vc:0 status:0x00000110
<…>-95390 [000] … 82642.704158: rtcpu_nvcsi_intr: tstamp:2582910467013 class:GLOBAL type:STREAM_VC phy:0 cil:0 st:0 vc:0 status:0x00000004
<…>-95390 [000] … 82642.704158: rtcpu_nvcsi_intr: tstamp:2582910467013 class:GLOBAL type:PHY_INTR0 phy:0 cil:0 st:0 vc:0 status:0x00000110
<…>-95390 [000] … 82642.704159: rtcpu_nvcsi_intr: tstamp:2582910467013 class:CORRECTABLE_ERR type:PHY_INTR phy:0 cil:0 st:0 vc:0 status:0x00000100

vi and nvcsi clocks are. at its max values:
cat /sys/kernel/debug/bpmp/debug/clk/vi/rate
460800000

cat /sys/kernel/debug/bpmp/debug/clk/nvcsi/rate
314000000

Any clue what am I missing?

One thing I forgot to mention is we see two processes hogging cpu:
97611 root 20 0 0 0 0 R 100.0 0.0 2:20.99 vi-output, ar13
97613 root 20 0 0 0 0 R 94.1 0.0 2:20.54 vi-output, ar13

One per camera so something is wrong picking up frames from the csi mipi…

hello tirank,

could you please narrow down this, can you try Applications Using V4L2 IOCTL Directly to test the stream stability.

Thanks Jerry,
Already done this, see below:

v4l2-ctl --set-fmt-video=width=4192,height=3120,pixelformat=UYVY --stream-mmap --stream-count=1000 -d /dev/video0 --stream-to=/tmp/ov5693.raw
<<<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<<<<<<<< 10.00 fps
<<<18179072 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
< 10.00 fps
4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
< 10.00 fps
4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
<4096 != 26158080
< 10.00 fps

And this shows repeatedly on the csi logs:
<…>-95968 [000] … 84479.976119: rtcpu_nvcsi_intr: tstamp:2640324856890 class:GLOBAL type:PHY_INTR0 phy:0 cil:0 st:0 vc:0 status:0x00000110
<…>-95968 [000] … 84479.976119: rtcpu_nvcsi_intr: tstamp:2640324856890 class:GLOBAL type:PHY_INTR0 phy:1 cil:0 st:0 vc:0 status:0x00000110
<…>-95968 [000] … 84479.976119: rtcpu_nvcsi_intr: tstamp:2640324856890 class:CORRECTABLE_ERR type:PHY_INTR phy:0 cil:0 st:0 vc:0 status:0x00000110

It looks to start well but very quickly its not…
This is not an HW issue as it happens on every device I tried that also worked fine when we used jetpack4.4.

there shows PHY interrupt, the status 0x110 indicate there data-lane [A/B] FIFO overflow. you may also contact with sensor vendor to examine the signaling.

Thanks again Jerry.
To try amd recap, before i jump on to the sensor driver, can you please explain this:

  1. 0x110 look like the result, not the cause, because it happens after we are not receiving frames anymore. Can you please explain the 0x3c9 status provided earlier in the discussion? Can you please provide registers breakdown so i can conclude myself?
  2. Since we work with v4l2 camera that has its own isp (we dont use nvargus), this means raw packets are streamed via regular mem (not nvmem). Is there a limit as to how much data can be streamed this way? Any clocks beside emc, vi and csi involved here?
  3. Is there any limit on the resolution? Maybe on the line itself (width)?

Thanks