R32.7.1 / 4.9.253-rt168 : INFO: possible recursive locking detected [ (&pdata->busy_lock) in nvhost_module_busy ]

Hello,

I am seeing the following lockdep message on boot when running a fully preemptible (PREEMPT_RT) kernel with LOCKDEP and related lock-debugging options enabled.
Could you please let me know whether this is a real issue and whether there is a fix for it?

The issue appears to be the same one that was previously reported here but never addressed:

Thank you

My configuration:

Jetson AGX Xavier
JetPack 4.6.1
custom built r32.7.1 with rt-patches applied (4.9.253-rt168)
CONFIG_PREEMPT_RT_FULL=y
CONFIG_DEBUG_KERNEL=y
CONFIG_LOCKDEP=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_SPINLOCK=y

Kernel log:

[   52.199171] [ INFO: possible recursive locking detected ]
[   52.199200] 4.9.253-rt168-r32.7.1-tegra-RedHawk-7.5.8-r652-rt-sign...d-fix #1 Tainted: G        W      
[   52.199211] ---------------------------------------------
[   52.199224] nvargus-daemon/4161 is trying to acquire lock:
[   52.199310]  (&pdata->busy_lock){.+.+.+}, at: [<ffffff800858f370>] nvhost_module_busy+0x50/0x168
[   52.199318] 
[   52.199318] but task is already holding lock:
[   52.199352]  (&pdata->busy_lock){.+.+.+}, at: [<ffffff800858f370>] nvhost_module_busy+0x50/0x168
[   52.199361] 
[   52.199361] other info that might help us debug this:
[   52.199389]  Possible unsafe locking scenario:
[   52.199389] 
[   52.199413]        CPU0
[   52.199417]        ----
[   52.199441]   lock(&pdata->busy_lock);
[   52.199464]   lock(&pdata->busy_lock);
[   52.199471] 
[   52.199471]  *** DEADLOCK ***
[   52.199471] 
[   52.199480]  May be due to missing lock nesting notation
[   52.199480] 
[   52.199492] 1 lock held by nvargus-daemon/4161:
[   52.199545]  #0:  (&pdata->busy_lock){.+.+.+}, at: [<ffffff800858f370>] nvhost_module_busy+0x50/0x168
[   52.199554] 
[   52.199554] stack backtrace:
[   52.199572] CPU: 1 PID: 4161 Comm: nvargus-daemon Tainted: G        W       4.9.253-rt168-r32.7.1-tegra-RedHawk-7.5.8-r652-rt-sign...d-fix #1
[   52.199573] Hardware name: Jetson-AGX (DT)
[   52.199575] Call trace:
[   52.199583] [<ffffff800808c708>] dump_backtrace+0x0/0x1b0
[   52.199587] [<ffffff800808c9f4>] show_stack+0x24/0x30
[   52.199594] [<ffffff800904fdcc>] dump_stack+0xa0/0xd0
[   52.199600] [<ffffff800814abd4>] validate_chain.isra.19+0xbc4/0xc80
[   52.199603] [<ffffff800814bcdc>] __lock_acquire+0x2dc/0x758
[   52.199606] [<ffffff800814c688>] lock_acquire+0x108/0x278
[   52.199610] [<ffffff8009058d30>] down_read+0x50/0x68
[   52.199613] [<ffffff800858f370>] nvhost_module_busy+0x50/0x168
[   52.199619] [<ffffff80085d0240>] host1x_actmon_avg_norm+0x68/0x100
[   52.199624] [<ffffff80085b80a8>] nvhost_scale_get_dev_status+0x90/0xe0
[   52.199630] [<ffffff8008dad974>] devfreq_watermark_event_handler+0x154/0x4d8
[   52.199633] [<ffffff8008da6ea0>] devfreq_resume_device+0x60/0x88
[   52.199637] [<ffffff80085907f8>] nvhost_module_runtime_resume+0x1f0/0x280
[   52.199644] [<ffffff8008861c4c>] pm_generic_runtime_resume+0x3c/0x58
[   52.199649] [<ffffff80088715d8>] __genpd_runtime_resume+0x38/0xa0
[   52.199653] [<ffffff8008873dbc>] genpd_runtime_resume+0xa4/0x210
[   52.199657] [<ffffff800886427c>] __rpm_callback+0x34/0x58
[   52.199660] [<ffffff80088642d4>] rpm_callback+0x34/0x98
[   52.199662] [<ffffff8008865854>] rpm_resume+0x494/0x770
[   52.199665] [<ffffff8008865b84>] __pm_runtime_resume+0x54/0x98
[   52.199669] [<ffffff800858f37c>] nvhost_module_busy+0x5c/0x168
[   52.199673] [<ffffff80085a1e1c>] __nvhost_channelopen.isra.2+0x13c/0x4d8
[   52.199676] [<ffffff80085a21e4>] nvhost_channelopen+0x2c/0x38
[   52.199682] [<ffffff80082be06c>] chrdev_open+0x9c/0x1a0
[   52.199686] [<ffffff80082b5338>] do_dentry_open+0x1d8/0x340
[   52.199689] [<ffffff80082b6818>] vfs_open+0x58/0x88
[   52.199692] [<ffffff80082c9dd0>] do_last+0x468/0x8a0
[   52.199695] [<ffffff80082ca298>] path_openat+0x90/0x378
[   52.199698] [<ffffff80082cb820>] do_filp_open+0x70/0xe8
[   52.199701] [<ffffff80082b6cdc>] do_sys_open+0x174/0x258
[   52.199703] [<ffffff80082b6e44>] SyS_openat+0x3c/0x50
[   52.199707] [<ffffff80080833dc>] __sys_trace_return+0x0/0x4
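
For anyone reading the trace: the detail that stands out is that nvhost_module_busy appears twice in the same call chain. The outer call (at +0x5c) goes into __pm_runtime_resume, and the resume path ends up back in nvhost_module_busy (at +0x50), where down_read is called on a lock of the same class that lockdep reports as already held. Below is a minimal sketch for spotting such re-entered functions in a pasted trace. The helper name `reentered_symbols` and the regular expression are my own illustration, not part of any NVIDIA or kernel tooling, and the sample is a shortened excerpt of the trace above.

```python
import re
from collections import Counter

# Matches stack frames such as:
#   [   52.199613] [<ffffff800858f370>] nvhost_module_busy+0x50/0x168
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reentered_symbols(log_text):
    """Return symbols that occur more than once in the trace, in first-seen order.

    Hypothetical helper for illustration only.
    """
    symbols = FRAME_RE.findall(log_text)
    counts = Counter(symbols)
    result = []
    for sym in symbols:
        if counts[sym] > 1 and sym not in result:
            result.append(sym)
    return result


# Shortened excerpt of the call trace from this post.
sample = """\
[   52.199610] [<ffffff8009058d30>] down_read+0x50/0x68
[   52.199613] [<ffffff800858f370>] nvhost_module_busy+0x50/0x168
[   52.199619] [<ffffff80085d0240>] host1x_actmon_avg_norm+0x68/0x100
[   52.199637] [<ffffff80085907f8>] nvhost_module_runtime_resume+0x1f0/0x280
[   52.199665] [<ffffff8008865b84>] __pm_runtime_resume+0x54/0x98
[   52.199669] [<ffffff800858f37c>] nvhost_module_busy+0x5c/0x168
"""

print(reentered_symbols(sample))
```

This only flags repeated symbols; it does not tell whether both calls operate on the same device's busy_lock, which is what decides whether the report is a real deadlock or a missing nesting annotation.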

Hi,
We would like to reproduce the error. Could you confirm that the following steps are correct?

  1. Apply rt-patches (4.9.253-rt168) to r32.7.1
  2. Enable the configs

CONFIG_PREEMPT_RT_FULL=y
CONFIG_DEBUG_KERNEL=y
CONFIG_LOCKDEP=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_SPINLOCK=y

  3. Rebuild/replace kernel image
  4. After booting, the error prints are shown in dmesg
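
To rule out a config mismatch when following these steps, a quick check that the listed options really ended up enabled in the built kernel's config can help. The following is a rough sketch, assuming the config is available as plain text; the function name `missing_options` is hypothetical and the sample text is made up for the example.

```python
# Options listed in this thread as required to see the lockdep report.
REQUIRED = [
    "CONFIG_PREEMPT_RT_FULL",
    "CONFIG_DEBUG_KERNEL",
    "CONFIG_LOCKDEP",
    "CONFIG_PROVE_LOCKING",
    "CONFIG_DEBUG_SPINLOCK",
]


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in config_text.

    Hypothetical helper for illustration only.
    """
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


# Made-up sample in which one option was left disabled.
sample = """\
CONFIG_PREEMPT_RT_FULL=y
CONFIG_DEBUG_KERNEL=y
CONFIG_LOCKDEP=y
CONFIG_PROVE_LOCKING=y
# CONFIG_DEBUG_SPINLOCK is not set
"""

print(missing_options(sample))
```

For a gzip-compressed config, the text can be read with `gzip.open(path, "rt").read()`. On a running system, `/proc/config.gz` is only present if the kernel was built with CONFIG_IKCONFIG_PROC.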

Hi,
We could not reproduce the issue:
R32.7.1 / 4.9.253-rt168 : INFO: possible circular locking dependency detected (nvpmodel: all_q_mutex + &hp->lock) - #5 by sumitg

Could you share the exact steps so that we can reproduce it on a Xavier developer kit and check?

The issue appears to alternate between this one and another one reported here, depending on which one comes up first:

Please see the logs and the full config attached.

Thank you.

20220802-JP461-bootlog.tgz (59.4 KB)
config-4.9.253-rt168.gz (37.1 KB)