Kernel crash when stopping a Libargus-based camera application

Hi Forum,

We are using the Libargus API to acquire image frames from dual RAW Bayer cameras (our code is based on the stereoSync sample). However, when stopping our application, we hit the following kernel warning and call trace:

[ 2412.924470] WARNING: CPU: 1 PID: 32576 at kernel/module.c:1162 module_put+0x18c/0x1a0
[ 2412.932523] Modules linked in: veth nvidia_modeset(OE) fuse xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_addrtype iptable_filter br_netfilter lzo_rle lzo_compress zram overlay ramoops reed_solomon loop bnep iwlmvm mac80211 aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce sha2_ce iwlwifi sha256_arm64 inv_icm42600_spi btusb sha1_ce input_leds btrtl btbcm nvgpu btintel pwm_fan inv_icm42600_i2c r8168 cfg80211 inv_icm42600 nct1008 kfifo_buf nv_imx296 max96793 max96792 tegra_bpmp_thermal ina3221 spi_tegra114 nvidia(OE) binfmt_misc nvmap ip_tables x_tables [last unloaded: mtd]
[ 2412.932590] CPU: 1 PID: 32576 Comm: CaptureSchedule Tainted: G           OE     5.10.120-tegra #2
[ 2412.932591] Hardware name: Unknown NVIDIA Orin NX Developer Kit/NVIDIA Orin NX Developer Kit, BIOS 4.1-33958178 08/01/2023
[ 2412.932594] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
[ 2412.932597] pc : module_put+0x18c/0x1a0
[ 2412.932598] lr : module_put+0x20/0x1a0
[ 2412.932599] sp : ffff8000320cb990
[ 2412.932601] x29: ffff8000320cb990 x28: ffff0fa393b35600 
[ 2412.932603] x27: ffff0fa38c0fd0b8 x26: 0000000000000000 
[ 2412.932606] x25: ffffc68553268000 x24: ffff0fa3c1274158 
[ 2412.932609] x23: ffff0fa3c1275160 x22: ffff0fa385f8f020 
[ 2412.932611] x21: ffff0fa3ab60ba80 x20: ffffc685b69c2248 
[ 2412.932614] x19: ffffc68553268300 x18: 0000000000000000 
[ 2412.932616] x17: 0000000000000000 x16: 0000000000000000 
[ 2412.932619] x15: 0000ffff8000131c x14: 0000000000000580 
[ 2412.932621] x13: 0000000000000285 x12: 0000000000000024 
[ 2412.932624] x11: 071c71c71c71c71c x10: 0000000000000ab0 
[ 2412.932626] x9 : ffff8000320cb7c0 x8 : ffff0fa390dd54d0 
[ 2412.932629] x7 : 0000000000000826 x6 : 00000000000003e8 
[ 2412.932631] x5 : 00000000410fd420 x4 : 0000000000000000 
[ 2412.932633] x3 : ffffc68553268640 x2 : 00000000ffffffff 
[ 2412.932636] x1 : ffff0fa390dd49c0 x0 : 0000000000000001 
[ 2412.932639] Call trace:
[ 2412.932641]  module_put+0x18c/0x1a0
[ 2412.932645]  v4l2sd_stream+0xb8/0x230
[ 2412.932648]  tegra_channel_set_stream+0x3d4/0x580
[ 2412.932651]  vi5_channel_stop_streaming+0x34/0xe0
[ 2412.932653]  tegra_channel_stop_streaming+0x3c/0x70
[ 2412.932656]  __vb2_queue_cancel+0x40/0x220
[ 2412.932658]  vb2_core_streamoff+0x34/0xd0
[ 2412.932659]  vb2_streamoff+0x34/0x80
[ 2412.932661]  vb2_ioctl_streamoff+0x58/0x70
[ 2412.932663]  v4l_streamoff+0x40/0x50
[ 2412.932665]  __video_do_ioctl+0x188/0x400
[ 2412.932667]  video_usercopy+0x280/0x790
[ 2412.932669]  video_ioctl2+0x40/0x180
[ 2412.932670]  v4l2_ioctl+0x68/0x90
[ 2412.932674]  __arm64_sys_ioctl+0xac/0xf0
[ 2412.932677]  el0_svc_common.constprop.0+0x80/0x1d0
[ 2412.932679]  do_el0_svc+0x38/0xb0
[ 2412.932682]  el0_svc+0x1c/0x30
[ 2412.932684]  el0_sync_handler+0xa8/0xb0
[ 2412.932686]  el0_sync+0x16c/0x180
[ 2412.932687] ---[ end trace 782af91b8b725d5b ]---

Could you please advise on how to make the shutdown more robust? We are using JetPack 5.1.2.

Thanks in advance and best regards,
Khang

hello khang.l4es,

we found there's an issue when VI is trying to recover the camera stream.
could you please refer to Topic 305007 and apply the two kernel patches for verification?

Hello @JerryChang,

I applied the patches from Topic 305007 and the crash no longer occurs. However, on one of the setups, I see the RCE error below quite often (I did not see it before applying the patches):

[ 1375.899859] [RCE] ERROR: camera-ip/vi5/vi5.c:745 [vi5_handle_eof] "General error queue is out of sync with frame queue. ts=1222981193280 sof_ts=1396904249632 gerror_code=2 gerror_data=222 notify_bits=20010"

Stopping and re-launching the reading of the cameras is still possible.
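
To quantify "quite often", here is a minimal sketch that pulls these lines out of a saved dmesg log. The field names (`ts`, `sof_ts`, `gerror_code`, `gerror_data`, `notify_bits`) are copied from the message above; the function and variable names are my own, and I keep the values as strings because I do not know how the RCE firmware encodes them.

```python
import re

# Matches the vi5_handle_eof message quoted above. Only the layout of that
# one line is assumed; other RCE messages are ignored.
RCE_EOF_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+\[RCE\] ERROR: .*\[vi5_handle_eof\].*?"
    r"ts=(?P<ts>\d+)\s+sof_ts=(?P<sof_ts>\d+)\s+"
    r"gerror_code=(?P<gerror_code>\d+)\s+gerror_data=(?P<gerror_data>\d+)\s+"
    r"notify_bits=(?P<notify_bits>[0-9a-fA-F]+)"
)


def parse_rce_eof_errors(log_text):
    """Return one dict per matching line; only uptime is converted to float."""
    events = []
    for line in log_text.splitlines():
        match = RCE_EOF_RE.search(line)
        if match:
            event = match.groupdict()
            event["uptime"] = float(event["uptime"])
            events.append(event)
    return events


# The exact line from this post, used as sample input.
SAMPLE = (
    '[ 1375.899859] [RCE] ERROR: camera-ip/vi5/vi5.c:745 [vi5_handle_eof] '
    '"General error queue is out of sync with frame queue. '
    'ts=1222981193280 sof_ts=1396904249632 gerror_code=2 gerror_data=222 '
    'notify_bits=20010"'
)

if __name__ == "__main__":
    for event in parse_rce_eof_errors(SAMPLE):
        print(event)
```

Feeding it the output of `dmesg` saved to a file gives a list whose length is the number of occurrences, and the `uptime` values show how they are spread over time.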

What could be the potential cause, please?

Best Regards,
Khang

hello khang.l4es,

this is also a known issue with JP-5.1.2. honestly, there are several camera software bugs in the early JP-5 releases.
please refer to Topic 309780 for the rce-fw update that resolves this failure.

anyway,
we suggest moving to the latest JP-5 release (i.e. JetPack 5.1.4), which includes several camera software bug fixes.
you may also see Jetson_Linux_Release_Notes_r35.6.0 for more details.
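
if several setups need checking, a small script can report which Jetson Linux release each one runs. this is only a sketch: it assumes the usual first-line format of /etc/nv_tegra_release ("# R35 (release), REVISION: 4.1, ..."), and it assumes the mapping JetPack 5.1.2 = r35.4.1 and JetPack 5.1.4 = r35.6.0. the function names and the sample line are illustrative.

```python
import re

# Assumed layout of the first line of /etc/nv_tegra_release.
L4T_RE = re.compile(
    r"#\s*R(?P<major>\d+)\s*\(release\),\s*REVISION:\s*(?P<rev>\d+(?:\.\d+)*)"
)


def parse_l4t_version(release_line):
    """Return a version string such as '35.4.1', or None if the line does not match."""
    match = L4T_RE.search(release_line)
    if match is None:
        return None
    return f"{match.group('major')}.{match.group('rev')}"


def version_tuple(version):
    """Turn '35.4.1' into (35, 4, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_older_than(version, target="35.6.0"):
    """True if version predates target (default: the release named above)."""
    return version_tuple(version) < version_tuple(target)


if __name__ == "__main__":
    # Illustrative line only; on a device, read /etc/nv_tegra_release instead.
    sample = "# R35 (release), REVISION: 4.1, BOARD: t186ref, EABI: aarch64"
    version = parse_l4t_version(sample)
    print(version, "needs update:", is_older_than(version))
```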

Hi @JerryChang ,

Thank you for confirming. I think I will update the rce-fw first, as all our other setups are aligned with JetPack 5.1.2 and migrating them will take time.

Best Regards,
Khang
