When I tested the camera, there was an exception

Hi All,
When I tested the camera, there was an exception as follows. Could you please help me find out what caused it? Thank you!

root@tegra:/home/robot# more /etc/nv_tegra_release
# R35 (release), REVISION: 4.1, GCID: 33958178, BOARD: t186ref, EABI: aarch64, DATE: Tue Aug  1 19:57:35 UTC 2023

Device: Orin NX

[449653.075640] ------------[ cut here ]------------
[449653.080536] WARNING: CPU: 3 PID: 2848998 at lib/refcount.c:25 refcount_warn_saturate+0xa0/0x150
[449653.089741] ---[ end trace 58dbee192ac955fc ]---
[449653.095704] WARNING: CPU: 3 PID: 2848998 at kernel/kthread.c:83 kthread_stop+0x58/0x270
[449653.104205] ---[ end trace 58dbee192ac955fd ]---
[449653.110062] Unable to handle kernel paging request at virtual address 0000ffff886be0e0
[449653.119740] Mem abort info:
[449653.122732]   ESR = 0x96000004
[449653.126304]   EC = 0x25: DABT (current EL), IL = 32 bits
[449653.132917]   SET = 0, FnV = 0
[449653.136280]   EA = 0, S1PTW = 0
[449653.139626] Data abort info:
[449653.142703]   ISV = 0, ISS = 0x00000004
[449653.146745]   CM = 0, WnR = 0
[449653.149903] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001551ef000
[449653.156632] [0000ffff886be0e0] pgd=00000001225fe003, p4d=00000001225fe003, pud=000000013c414003, pmd=0000000000000000
[449653.173288] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[449653.179110] Modules linked in: cdc_mbim cdc_wdm input_leds nvidia_modeset(OE) fuse mttcan can_dev can_raw can xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_addrtype iptable_filter br_netfilter lzo_rle lzo_compress zram overlay ramoops reed_solomon option usb_wwan loop 8021q garp mrp wlan(OE) snd_soc_tegra186_dspk snd_soc_tegra186_asrc snd_soc_tegra210_ope snd_soc_tegra186_arad snd_soc_tegra210_iqc snd_soc_tegra210_mvc snd_soc_tegra210_afc snd_soc_tegra210_mixer snd_soc_tegra210_dmic snd_soc_tegra210_adx snd_soc_tegra210_amx snd_soc_tegra210_i2s snd_soc_tegra210_admaif aes_ce_blk snd_soc_tegra_pcm snd_soc_tegra210_sfc crypto_simd cryptd aes_ce_cipher ghash_ce sha2_ce cp210x usbserial sha256_arm64 sha1_ce snd_soc_tegra_machine_driver snd_soc_tegra210_adsp cfg80211 snd_soc_tegra_utils snd_soc_spdif_tx snd_soc_simple_card_utils pwm_fan nvgpu ina3221 r8168 snd_hda_codec_hdmi userspace_alert nvadsp
[449653.179182]  snd_hda_tegra fusb301 snd_hda_codec tegra_bpmp_thermal snd_soc_tegra210_ahub tegra210_adma nv_ar0233 snd_hda_core spi_tegra114 nvidia(OE) wlan_cnss_core_pcie(OE) binfmt_misc nvmap GobiNet(OE) ip_tables x_tables [last unloaded: mtd]
[449653.292388] CPU: 3 PID: 2848998 Comm: camera-node Tainted: G        W  OE     5.10.120-tegra #2
[449653.301385] Hardware name: Unknown NVIDIA Orin NX Developer Kit/NVIDIA Orin NX Developer Kit, BIOS 202210.3-6362cf511-dirty 12/19/2023
[449653.313863] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[449653.320117] pc : kthread_stop+0x94/0x270
[449653.324222] lr : kthread_stop+0x4c/0x270
[449653.328322] sp : ffff800031b83aa0
[449653.331800] x29: ffff800031b83aa0 x28: ffff6675072d2d00 
[449653.337334] x27: ffffab2a5c642000 x26: 0000ffffae03878c 
[449653.342859] x25: fffffffffffffdfc x24: ffff6675072d3300 
[449653.348385] x23: ffff667685bec1f8 x22: ffff66734116a0a0 
[449653.353905] x21: ffff66737db39e30 x20: 0000ffff886be0e0 
[449653.359438] x19: ffff66737db39e00 x18: 0000000000000000 
[449653.364971] x17: 0000000000000000 x16: 0000000000000000 
[449653.370496] x15: 0000ffffac65f100 x14: 0000000000000000 
[449653.376014] x13: 0000000000000000 x12: 0000000000000000 
[449653.381542] x11: 0000000000000000 x10: ffffab2a5cb6d990 
[449653.387063] x9 : ffffab2a5a8edd5c x8 : 0000000000029fb8 
[449653.392592] x7 : 00000000000030d8 x6 : c000000100000209 
[449653.398122] x5 : ffff6676ae7a2998 x4 : 0000000000000000 
[449653.403654] x3 : 0000000000000001 x2 : ffff6676ae7a29a0 
[449653.409184] x1 : 0000000000000000 x0 : 0000000000000002 
[449653.414707] Call trace:
[449653.417300]  kthread_stop+0x94/0x270
[449653.421049]  vi5_channel_stop_kthreads+0x44/0x60
[449653.425863]  vi5_channel_stop_streaming+0xe0/0x100
[449653.430859]  tegra_channel_stop_streaming+0x30/0x60
[449653.435936]  __vb2_queue_cancel+0x3c/0x220
[449653.440213]  vb2_core_queue_release+0x2c/0x60
[449653.444763]  _vb2_fop_release+0x88/0xb0
[449653.448770]  tegra_channel_close+0x64/0x190
[449653.453151]  v4l2_release+0xc4/0x100
[449653.456895]  __fput+0x80/0x240
[449653.460104]  ____fput+0x18/0x30
[449653.463407]  task_work_run+0x88/0x150
[449653.467246]  do_exit+0x354/0xab0
[449653.470639]  do_group_exit+0x4c/0xb0
[449653.474385]  get_signal+0x114/0x860
[449653.478041]  do_notify_resume+0x18c/0xa10
[449653.482229]  work_pending+0xc/0x580
[449653.485891] Code: f9432274 d503201f d503201f d2800040 (f820329f) 
[449653.492231] ---[ end trace 58dbee192ac955fe ]---
[449653.502175] Kernel panic - not syncing: Oops: Fatal exception
[449653.508151] SMP: stopping secondary CPUs
[449653.512267] Kernel Offset: 0x2b2a4a7e0000 from 0xffff800010000000
[449653.518594] PHYS_OFFSET: 0xffff998dc0000000
[449653.522970] CPU features: 0x081c0146,4a80aa38

Log files:

console-ramoops-0.txt (381.7 KB)
dmesg-ramoops-0.txt (106.6 KB)
dmesg-ramoops-1.txt (106.6 KB)
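
For anyone working through logs like these, the crash path can be pulled out of a dmesg or ramoops dump with a small filter. This is only a sketch: the function name `extract_call_trace` is made up for illustration, and it assumes the log uses the same `Call trace:` ... `Code:` layout as the oops quoted above.

```shell
# Hypothetical helper (not part of any NVIDIA tool): print the frames of each
# kernel "Call trace:" block, without the leading [timestamp] prefix.
# Reads the files given as arguments, or stdin when none are given.
extract_call_trace() {
    awk '
        /Call trace:/ { in_trace = 1; next }      # frames start on the next line
        /Code: |end trace/ { in_trace = 0 }       # block ends at "Code:" or the end marker
        in_trace {
            sub(/^\[[ 0-9.]+\] */, "")            # drop the "[449653.417300] " style prefix
            print
        }
    ' "$@"
}
```

Usage would be something like `extract_call_trace dmesg-ramoops-0.txt`. For the oops quoted above, the first two frames are `kthread_stop+0x94/0x270` and `vi5_channel_stop_kthreads+0x44/0x60`, which places the fault in the VI5 stop-streaming path that runs when the video device is closed.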

Hello @wxsrite,

Just a couple of questions to better understand the situation:

  1. What camera are you using?
  2. What is the command you are running in order to capture?

regards,
Andrew

hello wxsrite,

Just to double-check: which additional kernel patches have you applied so far?
This looks like the issue seen on JP-5.1.3/r35.5.0, which is under investigation.

I merged the following change, which synchronizes these files from JP-5.1.3 into JP-5.1.2:

kernel_src/kernel/nvidia/drivers/media/platform/tegra/camera/vi/vi5_fops.c
kernel_src/kernel/nvidia/include/media/mc_common.h
kernel_src/kernel/nvidia/drivers/media/platform/tegra/camera/fusa-capture/capture-vi.c

hello wxsrite,

I see… so the tree is no longer pure JP-5.1.2 kernel sources.
Could you please apply this kernel patch to confirm: 0001-Revert-camera-vi5-fix-stream-on-off-memory-leakage.patch (6.0 KB)
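
Since the tree already carries files synced from another release, it may be worth checking whether a patch is already present or still applies cleanly before building. A minimal sketch, assuming `git` is installed; the function name `check_and_apply_patch` and its two arguments are hypothetical, and only standard `git apply` options are used.

```shell
# Hypothetical helper: report whether a patch is already in the tree, applies
# cleanly, or conflicts. Assumes src_dir is the root that the patch paths are
# relative to, and that patch_file is given as an absolute path.
check_and_apply_patch() {
    src_dir=$1
    patch_file=$2
    if git -C "$src_dir" apply --reverse --check "$patch_file" 2>/dev/null; then
        # The patch can be cleanly reversed, so its changes are already present.
        echo "already applied"
    elif git -C "$src_dir" apply --check "$patch_file" 2>/dev/null; then
        # Dry run passed, so apply for real.
        git -C "$src_dir" apply "$patch_file" && echo "applied"
    else
        echo "does not apply cleanly"
        return 1
    fi
}
```

A result of "does not apply cleanly" would suggest the local sources differ from what the patch was generated against, which is useful to report back along with the test result.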

Hi JerryChang,
You are right, I will verify it. Because the issue does not reproduce every time, it will take some time and repeated testing. I will let you know as soon as there is any news.

hi JerryChang:

After applying the patch, the problem still occurs, with the same symptoms as before.
console-ramoops-0.txt (134.7 KB)
dmesg-ramoops-0.txt (106.6 KB)
dmesg-ramoops-1.txt (106.6 KB)

vi5_fops.txt (28.4 KB)

hello wxsrite,

Please apply one more kernel patch for verification:
0001-kernel-camera-remove-csi-event-sync-for-recover.patch (1.7 KB)


BTW,
it's hard to tell which patches you've applied, since the tree is no longer pure JP-5.1.2 kernel sources.
Did you follow the "To Sync the Kernel Sources with Git" steps?
It's better practice to fetch the code with Git, which keeps the change history (git log) for tracking.
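
Once the sources are in a Git checkout, the local modifications can be listed against the unmodified release. A rough sketch using only standard `git log` and `git diff` options; the function name `show_local_changes` is hypothetical, and `base_ref` stands for whatever tag or commit marks the pristine release in your tree (an assumption, since the exact name depends on how the sources were fetched).

```shell
# Hypothetical helper: list what differs between a known baseline and the
# current tree. base_ref is the tag or commit of the unmodified release
# (an assumption; use the name that exists in your checkout).
show_local_changes() {
    src_dir=$1
    base_ref=$2
    # Commits added on top of the baseline, newest first.
    git -C "$src_dir" log --oneline "$base_ref"..HEAD
    # Files whose content differs from the baseline, including uncommitted edits.
    git -C "$src_dir" diff --name-only "$base_ref"
}
```

Posting that output along with a crash log makes it clear exactly which files, such as the synced vi5_fops.c, differ from the stock release.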