Vi_capture_release causes kernel panic

Hi,
We are using an AGX Xavier; the L4T version is R35.5.0.
We have connected four external CSI x4 peripherals to the Jetson.
The kernel crashes with a probability of about one in a thousand.
We found the same issue in the forum topic "[V4L2] Vi capture release failed leading to a Kernel oops".
In that topic you provided a patch to fix this. However, that patch is for R32.2.

Does R35.5.0 have this problem as well? If so, is there a corresponding patch for it?

Here’s a printout from the crash:
[com COM6] (2024-08-27_113217) COM6 (USB Serial Port (COM6)).log (10.5 MB)

Your help in answering this question would be greatly appreciated!

hello newbie.lei,

ideally, the known fixes should already be checked in to the latest release.

may I also confirm what your capture pipeline is?
we've seen some VI driver bugs on r35.5.0 recently: a kernel panic can occur when VI tries to recover the stream.
please apply the two kernel patches below to resolve that,
0001-kernel-camera-remove-csi-event-sync-for-recover.patch (1.7 KB)
0002-vi5_fops-fix-mem-leak.patch (3.8 KB)

Thanks for your reply.
Our capture pipeline is:
HDMI IN → HDMI to CSI-2 Bridge → CSI → Jetson’s NVCSI → VI …
Yes, our kernel panic also occurs during VI reset.

We'll apply the patches you gave us and stress test again, and we'll reply to this post if there is any further progress.

Thank you.

hello newbie.lei,

thanks, looking forward to your results.

Unfortunately, after applying the two patches above, we can still reproduce the problem. Here is the serial port log. Any help in locating the cause would be greatly appreciated!

dmesg_0913.txt (1.7 MB)

hello newbie.lei,

it looks like error recovery has failed, which causes the kernel panic.
could you please dig into the VI driver to check which code section is causing the failure?
for instance,

[   45.309176] tegra-camrtc-capture-vi tegra-capture-vi: UIH: CSI channel receive error. So we reset the subdev
[   45.317997] tegra-camrtc-capture-vi tegra-capture-vi: fatal: error recovery failed
[   56.515867] ------------[ cut here ]------------
[   56.516061] WARNING: CPU: 3 PID: 2198 at lib/refcount.c:25 refcount_warn_saturate+0x98/0x140
[   56.516820] ---[ end trace 994f6133d08aefd9 ]---
[   56.516952] WARNING: CPU: 3 PID: 2198 at kernel/kthread.c:83 kthread_stop+0x58/0x290
[   56.517653] ---[ end trace 994f6133d08aefda ]---
[   56.517785] Unable to handle kernel paging request at virtual address 0000ffffaacb20e0
[   56.518033] Mem abort info:
[   56.518125]   ESR = 0x96000006
[   56.518211]   EC = 0x25: DABT (current EL), IL = 32 bits
[   56.518400]   SET = 0, FnV = 0
[   56.518497]   EA = 0, S1PTW = 0
[   56.518601] Data abort info:
[   56.518698]   ISV = 0, ISS = 0x00000006
[   56.518812]   CM = 0, WnR = 0
[   56.518922] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000145c39000
[   56.519107] [0000ffffaacb20e0] pgd=000000010b564003, p4d=000000010b564003, pud=0000000145e84003, pmd=0000000000000000
[   56.519424] Internal error: Oops: 96000006 [#1] PREEMPT SMP
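
For readers unfamiliar with the "Mem abort info" block above, here is a minimal Python sketch that decodes the ESR value from the log. The field layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, and for data aborts WnR in bit 6 and DFSC in bits 5:0) follows the Arm architecture definition of ESR_ELx; the function and table names are illustrative only and are not part of any NVIDIA or kernel tool.

```python
# Minimal sketch: decode the ESR value printed in an arm64 "Mem abort info" block.
# Names here (DFSC_NAMES, decode_esr) are hypothetical helpers for illustration.

# Data Fault Status Codes for translation faults (subset only).
DFSC_NAMES = {
    0b000100: "translation fault, level 0",
    0b000101: "translation fault, level 1",
    0b000110: "translation fault, level 2",
    0b000111: "translation fault, level 3",
}


def decode_esr(esr: int) -> dict:
    """Split an ESR_ELx value into its main fields."""
    ec = (esr >> 26) & 0x3F   # exception class
    il = (esr >> 25) & 0x1    # 1 = 32-bit instruction
    iss = esr & 0x1FFFFFF     # instruction specific syndrome
    info = {"EC": ec, "IL": il, "ISS": iss}
    if ec in (0x24, 0x25):    # data abort from lower EL / current EL
        dfsc = iss & 0x3F
        info["WnR"] = (iss >> 6) & 0x1   # 0 = read access, 1 = write access
        info["DFSC"] = dfsc
        info["fault"] = DFSC_NAMES.get(dfsc, "other (not in this subset)")
    return info


result = decode_esr(0x96000006)  # value taken from the log excerpt
print(result)
```

For `0x96000006` this gives EC = 0x25 (data abort at the current EL), WnR = 0 (a read), and a level 2 translation fault, which matches the `pmd=0000000000000000` entry in the page table walk. The log also labels the faulting address as belonging to the user page table, so the kernel appears to have read through a pointer that does not map to valid kernel memory.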

besides,
is it possible to move to the latest JP-5 release, i.e. JetPack 5.1.4,
and apply these two patches to test again?
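
When re-running a long stress test, it can help to scan the captured log automatically for the failure signature instead of reading it by hand. The following is a rough Python sketch under stated assumptions: the message fragments are copied from the log excerpt in this thread, the `[ seconds.micros]` timestamp prefix is assumed to start each line (a serial capture may add its own prefix), and `SIGNATURES` and `scan_dmesg` are hypothetical names.

```python
import re

# Message fragments copied from the log excerpt in this thread.
SIGNATURES = {
    "csi_error": "CSI channel receive error",
    "recovery_failed": "fatal: error recovery failed",
    "paging_fault": "Unable to handle kernel paging request",
    "oops": "Internal error: Oops",
}

# Assumes the usual dmesg prefix, e.g. "[   45.309176] message".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def scan_dmesg(lines):
    """Return (timestamp_seconds, signature_key) tuples in log order."""
    events = []
    for line in lines:
        match = TIMESTAMP.match(line.strip())
        if not match:
            continue  # skip lines without a kernel timestamp
        ts, msg = float(match.group(1)), match.group(2)
        for key, needle in SIGNATURES.items():
            if needle in msg:
                events.append((ts, key))
    return events


sample = [
    "[   45.309176] tegra-camrtc-capture-vi tegra-capture-vi: UIH: CSI channel receive error. So we reset the subdev",
    "[   45.317997] tegra-camrtc-capture-vi tegra-capture-vi: fatal: error recovery failed",
    "[   56.517785] Unable to handle kernel paging request at virtual address 0000ffffaacb20e0",
    "[   56.519424] Internal error: Oops: 96000006 [#1] PREEMPT SMP",
]
events = scan_dmesg(sample)
for ts, key in events:
    print(f"{ts:12.6f}  {key}")
```

To scan a full capture, pass an open file instead of `sample`, for example `scan_dmesg(open("dmesg_0913.txt", errors="replace"))`. In the excerpt above, the recovery failure is logged at about 45.3 s and the Oops at about 56.5 s, so the crash follows the failed recovery by roughly 11 seconds rather than immediately.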
