vi_capture_request_unpin causes kernel panic

Hi,
We use Jetson AGX Xavier. The L4T version is R35.6.0.
We designed the carrier board ourselves, and we externally connected four 4-lane sensors to it.

Occasionally, when we call STREAMOFF on the sensor, the following error occurs:

<3>[52327.552631] tegra194-vi5 15c10000.vi: unexpected response from camera processor
<3>[52327.552863] tegra194-vi5 15c10000.vi: vi_capture_release: release channel IVC failed

This then triggers WARN_ON("RTCPU is in a bad state. Reboot to recover"):

<6>[52327.553130] ------------[ cut here ]------------
<4>[52327.553148] WARNING: CPU: 4 PID: 570525 at /home/gitlab-runner/builds/x2eF_KiB/0/csw/jetson/l4t/r3560/kernel/nvidia/drivers/media/platform/tegra/camera/fusa-capture/capture-vi.c:974 vi_capture_release+0x2b4/0x2e0
<6>[52327.553642] Modules linked in: bnep cfg80211 xt_nat iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter ramoops reed_solomon uvcvideo videobuf2_vmalloc snd_soc_tegra210_iqc snd_soc_tegra186_dspk snd_soc_tegra186_asrc snd_soc_tegra210_ope snd_soc_tegra210_mvc snd_soc_tegra186_arad snd_soc_tegra210_dmic snd_soc_tegra210_afc snd_soc_tegra210_adx snd_soc_tegra210_admaif snd_soc_tegra210_amx snd_soc_tegra210_mixer snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce sha2_ce sha256_arm64 sha1_ce leds_gpio snd_soc_spdif_tx snd_soc_tegra_machine_driver uih_fpga_codec uih_gpio max77620_thermal input_leds loop snd_soc_tegra210_adsp snd_soc_tegra_utils snd_soc_simple_card_utils snd_soc_tegra210_ahub nct1008 nvadsp tegra210_adma snd_hda_codec_hdmi tegra_bpmp_thermal userspace_alert snd_hda_tegra snd_hda_codec snd_hda_core snd_soc_rt5640 snd_soc_rl6231 at24 lt6911uxe spi_tegra114 binfmt_misc ina3221 nvgpu
<6>[52327.553820]  nvmap ip_tables x_tables [last unloaded: mtd]
<6>[52327.553838] CPU: 4 PID: 570525 Comm: kworker/4:1 Not tainted 5.10.216-tegra #1
<6>[52327.553841] Hardware name: NVIDIA Jetson-AGX/Jetson, BIOS 202210.5-c101ba51-dirty 05/20/2025
<6>[52327.553854] Workqueue: events delayed_fput
<6>[52327.553865] pstate: 60c00009 (nZCv daif +PAN +UAO -TCO BTYPE=--)
<6>[52327.553871] pc : vi_capture_release+0x2b4/0x2e0
<6>[52327.553877] lr : vi_capture_release+0x2b4/0x2e0
<6>[52327.553880] sp : ffff800052d1bb60
<6>[52327.553893] x29: ffff800052d1bb60 x28: ffffbbbef1fe6000 
<6>[52327.553909] x27: ffff8000592ebca8 x26: ffff27bcdfdfa248 
<6>[52327.553923] x25: 0000000000000000 x24: ffff27bb3df6d880 
<6>[52327.553937] x23: 00000000ffffffea x22: ffff27bb8644f200 
<6>[52327.553951] x21: 000000000000ffff x20: 0000000000000001 
<6>[52327.553965] x19: ffff27bb8267f800 x18: 0000000000000010 
<6>[52327.553979] x17: 0000000000000000 x16: ffffbbbef01b5090 
<6>[52327.553993] x15: ffff27bb3df6ddf0 x14: 633531203569762d 
<6>[52327.554031] x13: 3439316172676574 x12: 7061635f6976203a 
<6>[52327.554067] x11: 69762e3030303031 x10: 6572203a65736165 
<6>[52327.554077] x9 : 6c65725f65727574 x8 : 5649206c656e6e61 
<6>[52327.554085] x7 : 686320657361656c x6 : c0000000ffff7fff 
<6>[52327.554094] x5 : ffff27c17fdf4970 x4 : ffffbbbef2007a48 
<6>[52327.554103] x3 : 0000000000000001 x2 : 0000000000000000 
<6>[52327.554149] x1 : 0000000000000000 x0 : 0000000000000000 
<6>[52327.554162] Call trace:
<6>[52327.554170]  vi_capture_release+0x2b4/0x2e0
<6>[52327.554176]  vi_capture_shutdown+0x6c/0x130
<6>[52327.554181]  vi_channel_close_ex+0x30/0x90
<6>[52327.554201]  vi_channel_release+0x30/0x40
<6>[52327.554207]  __fput+0x80/0x260
<6>[52327.554211]  delayed_fput+0x4c/0x60
<6>[52327.554219]  process_one_work+0x1c4/0x4c0
<6>[52327.554225]  worker_thread+0x54/0x450
<6>[52327.554230]  kthread+0x148/0x170
<6>[52327.554237]  ret_from_fork+0x10/0x18
<4>[52327.554242] ---[ end trace ab516625b3231e09 ]---

After this anomaly occurs, calling STREAMON on the sensor triggers a kernel panic:

<6>[52330.127609] [RCE] ERROR: camera-ip/vi5/vi5.c:3893 [vi5_update_channel_match_data] "match configuration is already in use by by channel 0 cannot set channel match configuration"
<6>[52330.127616] [RCE] ERROR: services/capture/capture-scheduler.c:3024 [capture_scheduler_thread] "Error processing event 0x100003"
<3>[52330.235603] tegra194-vi5 15c10000.vi: capture control message timed out
<3>[52330.235770] tegra194-vi5 15c10000.vi: vi_capture_release: release channel IVC failed

<1>[52330.322198] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
<1>[52330.322424] Mem abort info:
<1>[52330.322519]   ESR = 0x96000006
<1>[52330.322607]   EC = 0x25: DABT (current EL), IL = 32 bits
<1>[52330.322743]   SET = 0, FnV = 0
<1>[52330.322829]   EA = 0, S1PTW = 0
<1>[52330.322905] Data abort info:
<1>[52330.323004]   ISV = 0, ISS = 0x00000006
<1>[52330.326668]   CM = 0, WnR = 0
<1>[52330.329854] user pgtable: 4k pages, 48-bit VAs, pgdp=000000023f400000
<1>[52330.336347] [0000000000000000] pgd=00000001fa2ba003, p4d=00000001fa2ba003, pud=0000000282868003, pmd=0000000000000000
<0>[52330.347100] Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
<6>[52330.353242] Modules linked in: bnep cfg80211 xt_nat iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter ramoops reed_solomon uvcvideo videobuf2_vmalloc snd_soc_tegra210_iqc snd_soc_tegra186_dspk snd_soc_tegra186_asrc snd_soc_tegra210_ope snd_soc_tegra210_mvc snd_soc_tegra186_arad snd_soc_tegra210_dmic snd_soc_tegra210_afc snd_soc_tegra210_adx snd_soc_tegra210_admaif snd_soc_tegra210_amx snd_soc_tegra210_mixer snd_soc_tegra210_i2s snd_soc_tegra_pcm snd_soc_tegra210_sfc aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce sha2_ce sha256_arm64 sha1_ce leds_gpio snd_soc_spdif_tx snd_soc_tegra_machine_driver uih_fpga_codec uih_gpio max77620_thermal input_leds loop snd_soc_tegra210_adsp snd_soc_tegra_utils snd_soc_simple_card_utils snd_soc_tegra210_ahub nct1008 nvadsp tegra210_adma snd_hda_codec_hdmi tegra_bpmp_thermal userspace_alert snd_hda_tegra snd_hda_codec snd_hda_core snd_soc_rt5640 snd_soc_rl6231 at24 lt6911uxe spi_tegra114 binfmt_misc ina3221 nvgpu
<6>[52330.353471]  nvmap ip_tables x_tables [last unloaded: mtd]
<6>[52330.445647] CPU: 1 PID: 577512 Comm: vd_v4l2_event Tainted: G        W         5.10.216-tegra #1
<6>[52330.454566] Hardware name: NVIDIA Jetson-AGX/Jetson, BIOS 202210.5-c101ba51-dirty 05/20/2025
<6>[52330.462984] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
<6>[52330.469288] pc : vi_capture_request_unpin+0x44/0xd0
<6>[52330.474278] lr : vi_capture_request_unpin+0x34/0xd0
<6>[52330.479246] sp : ffff80001eeb3cb0
<6>[52330.482654] x29: ffff80001eeb3cb0 x28: ffff27ba14179d80 
<6>[52330.488167] x27: 0000000000000000 x26: 0000000000000000 
<6>[52330.493677] x25: 0000000000000000 x24: 0000000000000000 
<6>[52330.499189] x23: ffff27ba0011aca0 x22: ffff27ba0c6fa688 
<6>[52330.504704] x21: ffff27ba0c6fa400 x20: 0000000000000000 
<6>[52330.510213] x19: 0000000000000000 x18: 0000000000000000 
<6>[52330.515470] x17: 0000000000000000 x16: ffffbbbef01b5090 
<6>[52330.521240] x15: ffff27ba1417a2f0 x14: ffffffffffffffff 
<6>[52330.526405] x13: ffffbbbef262adf0 x12: ffffbbbef262aa20 
<6>[52330.531746] x11: 0000000000000040 x10: 0000000000000ab0 
<6>[52330.537517] x9 : ffff80001eeb3b40 x8 : 206c656e6e616863 
<6>[52330.542941] x7 : 206c6f72746e6f63 x6 : c0000000ffff7fff 
<6>[52330.548381] x5 : ffff27c17fdd4970 x4 : ffffbbbef2007a48 
<6>[52330.553534] x3 : 0000000000000000 x2 : ffff27ba14179d80 
<6>[52330.558868] x1 : 0000000000000000 x0 : 0000000000000000 
<6>[52330.564207] Call trace:
<6>[52330.566670]  vi_capture_request_unpin+0x44/0xd0
<6>[52330.571211]  vi_capture_shutdown+0x8c/0x130
<6>[52330.575668]  vi_channel_close_ex+0x30/0x90
<6>[52330.579692]  vi_channel_release+0x30/0x40
<6>[52330.583461]  __fput+0x80/0x260
<6>[52330.586694]  ____fput+0x24/0x30
<6>[52330.589847]  task_work_run+0x88/0xe0
<6>[52330.593088]  do_notify_resume+0x24c/0x990
<6>[52330.597110]  work_pending+0xc/0x738
<0>[52330.600795] Code: 52801901 f94156a0 9ba17e61 8b010014 (b8616800) 
<4>[52330.606918] ---[ end trace ab516625b3231e0d ]---
<3>[52330.619269] tegra194-vi5 15c10000.vi: failed to update control callback
<3>[52330.619623] tegra-camrtc-capture-vi tegra-capture-vi: vi capture setup failed
<3>[52330.625248] tegra-camrtc-capture-vi tegra-capture-vi: fatal: error recovery failed
<0>[52330.626542] Kernel panic - not syncing: Oops: Fatal exception
<2>[52330.638237] SMP: stopping secondary CPUs
<0>[52330.642181] Kernel Offset: 0x3bbee0190000 from 0xffff800010000000
<0>[52330.648377] PHYS_OFFSET: 0xffffd84700000000
<0>[52330.652061] CPU features: 0x48240002,03802a30
<0>[52330.656535] Memory Limit: none

The function vi_capture_request_unpin dereferenced a NULL pointer.
Below is the complete log:
dmesg-ramoops-1.txt (106.6 KB)
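
For anyone reading the Oops above, the ESR value can be decoded by hand. The sketch below is not part of the original report. It assumes the standard AArch64 ESR_ELx layout for a data abort (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], with WnR in ISS bit 6 and DFSC in ISS bits [5:0]); the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Fields of an AArch64 ESR_ELx value, as relevant to a data abort. */
struct esr_fields {
    unsigned ec;    /* exception class, bits [31:26] */
    unsigned il;    /* instruction length bit, bit 25 (1 = 32-bit) */
    unsigned iss;   /* instruction specific syndrome, bits [24:0] */
    unsigned wnr;   /* write-not-read, ISS bit 6 (0 = read) */
    unsigned dfsc;  /* data fault status code, ISS bits [5:0] */
};

/* Split a raw ESR value into the fields above. */
static struct esr_fields esr_decode(uint32_t esr)
{
    struct esr_fields f;

    f.ec = (esr >> 26) & 0x3f;
    f.il = (esr >> 25) & 0x1;
    f.iss = esr & 0x1ffffff;
    f.wnr = (f.iss >> 6) & 0x1;
    f.dfsc = f.iss & 0x3f;
    return f;
}

/* Name the translation-fault codes only; everything else is left generic. */
static const char *dfsc_name(unsigned dfsc)
{
    switch (dfsc) {
    case 0x04: return "translation fault, level 0";
    case 0x05: return "translation fault, level 1";
    case 0x06: return "translation fault, level 2";
    case 0x07: return "translation fault, level 3";
    default:   return "other fault (see the Arm architecture manual)";
    }
}
```

For ESR = 0x96000006 this gives EC = 0x25, ISS = 0x6, WnR = 0 and a level 2 translation fault. That agrees with the log: the faulting access was a read, and the pmd entry for virtual address 0 is empty.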

Could you please take a look at how to solve this problem?

hello newbie.lei,

it looks like a request timed out after you streamed off, and the driver is repeatedly attempting to reset the capture channel.
may I know what your stream-off steps are?
did you terminate the v4l2 capture process, or did you disable the sensor’s output stream?

let’s give it a try with this kernel patch: 0001-Camera-fix-kernel-warning-after-VI-timeout.patch (1.4 KB)
it prevents the kernel warning when there is no frame and VI returns a timeout status.

Hello,
Our method for stopping streaming may seem a bit unusual.
Our sensor supports hot-plugging, so the stream may stop abruptly, at which point an interrupt is sent to the driver. We then call STREAMOFF.
Alright, we’ll try to apply the patch and get back to you with the results.

hello newbie.lei,

FYI,
that’s not what the camera software stack expects. when you suddenly interrupt the stream, it triggers the error handling mechanism.

yes, when there is no frame after 2500 ms, the capture channel will reset.

<3>[52327.543748] tegra-camrtc-capture-vi tegra-capture-vi: uncorr_err: request timed out after 2500 ms
<4>[52327.543999] tegra-camrtc-capture-vi tegra-capture-vi: err_rec: attempting to reset the capture channel
<4>[52327.544702] tegra-camrtc-capture-vi tegra-capture-vi: err_rec: successfully reset the capture channel

At the same time, the driver will also receive an interrupt, and the application layer will call STREAMOFF to stop capture.
However, a kernel panic occurred when vi_capture_release() was called. Could this be a race condition in which vi_capture_release() releases resources that are then accessed through a NULL pointer?
How should we properly implement V4L2’s STREAMOFF and close() functions?

I assume the patch provided earlier (in post #3) should prevent the NULL pointer access.
