Orin VPI kernel panic

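When programming against the VPI interface, every run of our program ends in a kernel panic caused by a PVA error. The demo code is attached, and the log is as follows:
autocity@autocity-Desktop:~$ uptime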

14:18:53 up 3 min, 4 users, load average: 4.40, 2.35, 0.94

autocity@autocity-Desktop:~$

autocity@autocity-Desktop:~$

autocity@autocity-Desktop:~$

autocity@autocity-Desktop:~$ [ 3766.763680] falcon 154c0000.nvenc: Direct firmware load for nvhost_nvenc080.fw failed with error -2
[ 3766.763945] falcon 154c0000.nvenc: Falling back to sysfs fallback for: nvhost_nvenc080.fw
[ 3766.765907] falcon 154c0000.nvenc: looking for firmware in subdirectory
[ 6643.829716] main_vpi: page allocation failure: order:9, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[ 6643.830248] pva 16000000.pva0: nvpva_queue_task_pool_alloc: failed to allocate task_pool->kmem_addr
[ 6643.830524] BUG: Bad page state in process main_vpi pfn:186400
[ 6643.830702] page:00000000d75c3f45 refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff68200 pfn:0x186400
[ 6643.830992] head:00000000d75c3f45 order:9 compound_mapcount:1 compound_pincount:0
[ 6643.831199] flags: 0x8000000000090014(uptodate|lru|head|swapbacked)
[ 6643.831373] raw: 8000000000090014 fffffff610f32648 fffffff610f32688 0000000000000000
[ 6643.831589] raw: 0000000ffff68200 0000000000000000 00000000ffffffff ffff7d8099f8f000
[ 6643.831795] page dumped because: page still charged to cgroup
[ 6643.831944] page->mem_cgroup:ffff7d8099f8f000
[ 6643.832836] Disabling lock debugging due to kernel taint
[ 6643.833392] BUG: Bad page state in process main_vpi pfn:186600
[ 6643.834267] page:00000000ae0a13a6 refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff6ec00 pfn:0x186600
[ 6643.837417] head:00000000ae0a13a6 order:9 compound_mapcount:1 compound_pincount:0
[ 6643.844999] flags: 0x8000000000090014(uptodate|lru|head|swapbacked)
[ 6643.851132] raw: 8000000000090014 fffffff5fe0e8008 fffffff5fdfa0008 0000000000000000
[ 6643.859050] raw: 0000000ffff6ec00 0000000000000000 00000000ffffffff ffff7d83bb83f000
[ 6643.866774] page dumped because: page still charged to cgroup
[ 6643.872459] page->mem_cgroup:ffff7d83bb83f000
[ 6643.890631] pva 16000000.pva0: nvpva_queue_task_pool_alloc: failed to allocate task_pool->kmem_addr
[ 6649.253253] BUG: Bad rss-counter state mm:000000005eedb3ed type:MM_ANONPAGES val:512
[ 6649.253496] BUG: Bad rss-counter state mm:000000005eedb3ed type:MM_SHMEMPAGES val:-512
[ 6649.253741] BUG: non-zero pgtables_bytes on freeing mm: 4096
[ 7427.483814] BUG: Bad rss-counter state mm:0000000075f60f3d type:MM_ANONPAGES val:512
[ 7427.484135] BUG: Bad rss-counter state mm:0000000075f60f3d type:MM_SHMEMPAGES val:-512
[ 7427.484416] BUG: non-zero pgtables_bytes on freeing mm: 4096
[70738.983850] Unsafe core_pattern used with fs.suid_dumpable=2.
[70738.983850] Pipe handler or fully qualified core dump path required.
[70738.983850] Set kernel.core_pattern before fs.suid_dumpable.
[71181.010571] warn_alloc: 1 callbacks suppressed
[71181.010576] main_vpi: page allocation failure: order:9, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[71181.011249] pva 16000000.pva0: nvpva_queue_task_pool_alloc: failed to allocate task_pool->kmem_addr
[71181.011521] kernel BUG at mm/slub.c:4118!
[71181.011635] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[71181.011806] Modules linked in: input_leds(E) fuse(E) nvidia_modeset(OE) spidev(E) option(E) cdc_mbim(E) usb_wwan(E) cdc_wdm(E) usbserial(E) lzo_rle(E) lzo_compress(E) zram(E) overlay(E) ramoops(E) reed_solomon(E) nv_ox3cc(E) nv_ht_imx390(E) loop(E) can_raw(E) can(E) nvgpu(E) snd_soc_tegra186_asrc(E) aes_ce_blk(E) crypto_simd(E) snd_soc_tegra186_dspk(E) snd_soc_tegra210_ope(E) snd_soc_tegra210_mvc(E) snd_soc_tegra210_iqc(E) snd_soc_tegra186_arad(E) cryptd(E) snd_soc_tegra210_dmic(E) snd_soc_tegra210_afc(E) snd_soc_tegra210_adx(E) snd_soc_tegra210_amx(E) aes_ce_cipher(E) mcp25xxfd(E) snd_soc_tegra210_admaif(E) ghash_ce(E) snd_soc_tegra210_mixer(E) snd_soc_tegra210_i2s(E) snd_soc_tegra210_sfc(E) sha2_ce(E) 88x2cs(E) snd_soc_tegra_pcm(E) binfmt_misc(E) snd_soc_tegra210_adsp(E) sha256_arm64(E) snd_hda_codec_hdmi(E) ucsi_ccg(E) mttcan(E) sha1_ce(E) snd_soc_tegra_machine_driver(E) cfg80211(E) snd_hda_tegra(E) typec_ucsi(E) snd_soc_tegra_utils(E) pps_gpio(E) snd_soc_simple_card_utils(E) pwm_fan(E)
[71181.011868] snd_soc_spdif_tx(E) nvadsp(E) typec(E) snd_hda_codec(E) can_dev(E) nct1008(E) snd_soc_tegra210_ahub(E) ina3221(E) tegra_bpmp_thermal(E) snd_hda_core(E) snd_soc_rt5640(E) userspace_alert(E) max96712(E) tegra210_adma(E) snd_soc_rl6231(E) spi_tegra114(E) nvidia(OE) nvmap(E) ip_tables(E) x_tables(E) [last unloaded: mtd]
[71181.072207] CPU: 2 PID: 635388 Comm: main_vpi Tainted: G B W OE 5.10.104-tegra #17
[71181.080866] Hardware name: Unknown Jetson AGX Orin/Jetson AGX Orin, BIOS 3.1-32827747 03/19/2023
[71181.089793] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
[71181.095836] pc : kfree+0x3ec/0x470
[71181.099242] lr : kfree+0x24/0x470
[71181.102655] sp : ffff80003a47b960
[71181.106067] x29: ffff80003a47b960 x28: ffff7d7f07072000
[71181.111404] x27: ffff7d7f06d2c820 x26: ffffdb5c8e853b28
[71181.116916] x25: 0080000000000000 x24: ffffdb5c8e7f7000
[71181.122429] x23: ffffdb5c8cf8214c x22: ffff7d7f7b400000
[71181.127942] x21: ffff7d82f87be580 x20: 0000000000000000
[71181.133453] x19: fffffff5fdcd0000 x18: 0000000000000010
[71181.138879] x17: 0000000000000000 x16: ffffdb5c8c852d90
[71181.144391] x15: ffff7d82f87beaf0 x14: 745f65756575715f
[71181.149730] x13: 617670766e203a30 x12: 203a636f6c6c615f
[71181.155243] x11: 6c6f6f705f6b7361 x10: 61636f6c6c61206f
[71181.160667] x9 : 742064656c696166 x8 : 6176702e30303030
[71181.166179] x7 : 3030363120617670 x6 : c00000010000102b
[71181.171604] x5 : ffff7d8e2ea9c958 x4 : ffffdb5c8e507968
[71181.177029] x3 : 0000000000000001 x2 : ffffdb5c8c9e7460
[71181.182367] x1 : fffffff5fdc917c8 x0 : fffffff5fdc917c8
[71181.187704] Call trace:
[71181.190159] kfree+0x3ec/0x470
[71181.193308] nvpva_queue_alloc+0x40c/0x420
[71181.197330] pva_open+0x60/0x110
[71181.200570] chrdev_open+0xa8/0x1a0
[71181.204244] do_dentry_open+0x130/0x3a0
[71181.208005] vfs_open+0x38/0x50
[71181.210979] path_openat+0x848/0xdd0
[71181.214655] do_filp_open+0x84/0x110
[71181.218154] do_sys_openat2+0x1f4/0x2b0
[71181.221918] do_sys_open+0x7c/0xd0
[71181.225329] __arm64_sys_openat+0x2c/0x40
[71181.229357] el0_svc_common.constprop.0+0x7c/0x1c0
[71181.234166] do_el0_svc+0x34/0xa0
[71181.237406] el0_svc+0x1c/0x30
[71181.240556] el0_sync_handler+0xa8/0xb0
[71181.244316] el0_sync+0x16c/0x180
[71181.247555] Code: 17ffff7a f9400660 3707fba0 a90573fb (d4210000)
[71181.253688] ---[ end trace 5005693122f63716 ]---
[71181.262727] Kernel panic - not syncing: Oops - BUG: Fatal exception
[71181.264357] SMP: stopping secondary CPUs
[71181.268124] Kernel Offset: 0x5b5c7c830000 from 0xffff800010000000
[71181.274154] PHYS_OFFSET: 0xffff828200000000
[71181.278356] CPU features: 0x0040006,4a80aa38
[71181.282732] Memory Limit: none
[71181.290361] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---

[0000.061] I> MB1 (version: 0.32.0.1-t234-54845784-1cb23efd)
[0000.067] I> t234-A01-0-Silicon (0x12347) Prod
2023-10.zip (4.4 KB)
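
For anyone reading the log above: `order:9` in the `page allocation failure` lines means the kernel was asked for 2^9 physically contiguous pages, which is 2 MiB if the page size is 4 KiB (an assumption here; check `getconf PAGESIZE` on the device). A rough way to see whether such blocks are still available before launching `main_vpi` is to read `/proc/buddyinfo`, which lists free block counts per order for each zone. The sketch below is only a diagnostic aid with hypothetical helper names; it does not explain or fix the later `kfree` BUG in the PVA driver.

```python
def order_size_bytes(order, page_size=4096):
    """Size of one allocation of the given order (page_size is an assumption)."""
    return (1 << order) * page_size


def free_blocks_at_or_above(buddyinfo_text, order):
    """Count free blocks of at least `order` per (node, zone).

    Expects text in the /proc/buddyinfo layout:
        Node 0, zone   Normal   c0 c1 c2 ... cN
    where cK is the number of free blocks of order K.
    """
    result = {}
    for line in buddyinfo_text.splitlines():
        parts = line.split()
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue  # skip anything that does not look like a zone line
        node = parts[1].rstrip(",")
        zone = parts[3]
        counts = [int(c) for c in parts[4:]]
        # A larger free block can be split, so higher orders also count.
        result[(node, zone)] = sum(counts[order:])
    return result


if __name__ == "__main__":
    try:
        with open("/proc/buddyinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or the file is not readable
    print("order-9 block size:", order_size_bytes(9), "bytes")
    for (node, zone), n in free_blocks_at_or_above(text, 9).items():
        print(f"node {node} zone {zone}: {n} free blocks of order >= 9")
```

If every zone reports zero here, memory is too fragmented for an order-9 request at that moment, which would be consistent with the `failed to allocate task_pool->kmem_addr` message.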

Hi,

Thanks for sharing the reproducible code.
Which JetPack do you use? Is it JetPack 5.1.2?

Thanks.

We are using JetPack 5.1.1 [L4T 35.3.1].

Hi,

Would you mind checking if the same issue also occurs on our latest software (JetPack 5.1.2)?
Thanks.
