Lockdep - possible recursive locking - nvhost_module_busy

Hello,

If I build a RedHawk kernel with CONFIG_LOCKDEP=y then I see the message below. I am not sure how to fix this.

From the backtrace, however, I can see that nvhost_module_busy() ends up calling itself without either releasing ('up'ing) the rwsem pdata->busy_lock first or checking whether it is already held and proceeding accordingly (a simplified sketch of the re-entrant path is included after the trace below). I would like to know if there is a fix for this; I am not very familiar with the driver itself, so I cannot make the correct changes myself.

[  402.467394] =============================================
[  402.472807] [ INFO: possible recursive locking detected ]
[  402.478221] 4.9.201-rt134-r32.5.1-tegra-RedHawk-7.5.4-debug #1 Tainted: G        W      
[  402.486336] ---------------------------------------------
[  402.491750] nvargus-daemon/4373 is trying to acquire lock:
[  402.497249]  (&pdata->busy_lock){.+.+.+}, at: [<ffffff80084f8a18>] nvhost_module_busy+0x50/0x168
[  402.506103] 
[  402.506103] but task is already holding lock:
[  402.511951]  (&pdata->busy_lock){.+.+.+}, at: [<ffffff80084f8a18>] nvhost_module_busy+0x50/0x168
[  402.520798] 
[  402.520798] other info that might help us debug this:
[  402.527343]  Possible unsafe locking scenario:
[  402.527343] 
[  402.533278]        CPU0
[  402.535730]        ----
[  402.538179]   lock(&pdata->busy_lock);
[  402.541956]   lock(&pdata->busy_lock);
[  402.545733] 
[  402.545733]  *** DEADLOCK ***
[  402.545733] 
[  402.551669]  May be due to missing lock nesting notation
[  402.551669] 
[  402.558476] 1 lock held by nvargus-daemon/4373:
[  402.563017]  #0:  (&pdata->busy_lock){.+.+.+}, at: [<ffffff80084f8a18>] nvhost_module_busy+0x50/0x168
[  402.572307] 
[  402.572307] stack backtrace:
[  402.576679] CPU: 1 PID: 4373 Comm: nvargus-daemon Tainted: G        W       4.9.201-rt134-r32.5.1-tegra-RedHawk-7.5.4-debug #1
[  402.588102] Hardware name: NVIDIA Jetson Nano Developer Kit (DT)
[  402.594125] Call trace:
[  402.596581] [<ffffff800808c4d8>] dump_backtrace+0x0/0x1b0
[  402.601995] [<ffffff800808c7c4>] show_stack+0x24/0x30
[  402.607062] [<ffffff8008423318>] dump_stack+0xa0/0xd0
[  402.612128] [<ffffff80081223d4>] validate_chain.isra.19+0xbc4/0xc80
[  402.618413] [<ffffff80081234d4>] __lock_acquire+0x2dc/0x758
[  402.624000] [<ffffff8008123f20>] lock_acquire+0x108/0x260
[  402.629415] [<ffffff8008d8e248>] down_read+0x50/0x98
[  402.634394] [<ffffff80084f8a18>] nvhost_module_busy+0x50/0x168
[  402.640243] [<ffffff800851eb68>] host1x_actmon_avg_norm+0x68/0xb8
[  402.646353] [<ffffff8008520cf0>] nvhost_scale_get_dev_status+0x90/0xe0
[  402.652901] [<ffffff8008b5a840>] devfreq_watermark_event_handler+0x158/0x4b0
[  402.659970] [<ffffff8008b53e44>] devfreq_resume_device+0x64/0x88
[  402.665993] [<ffffff80084f9df0>] nvhost_module_runtime_resume+0x1f0/0x288
[  402.672803] [<ffffff8008776eb4>] pm_generic_runtime_resume+0x3c/0x58
[  402.679177] [<ffffff80087865f0>] __genpd_runtime_resume+0x38/0xa0
[  402.685289] [<ffffff8008788e3c>] genpd_runtime_resume+0xa4/0x218
[  402.691312] [<ffffff800877949c>] __rpm_callback+0x74/0xa0
[  402.696727] [<ffffff80087794fc>] rpm_callback+0x34/0x98
[  402.701967] [<ffffff800877aa18>] rpm_resume+0x460/0x738
[  402.707206] [<ffffff800877ad48>] __pm_runtime_resume+0x58/0xa0
[  402.713055] [<ffffff80084f8a24>] nvhost_module_busy+0x5c/0x168
[  402.718905] [<ffffff800850a9b4>] __nvhost_channelopen+0x13c/0x4a0
[  402.725015] [<ffffff800850ad44>] nvhost_channelopen+0x2c/0x38
[  402.730779] [<ffffff80082a3d94>] chrdev_open+0x9c/0x1a0
[  402.736018] [<ffffff800829b318>] do_dentry_open+0x1d8/0x340
[  402.741606] [<ffffff800829c7f8>] vfs_open+0x58/0x88
[  402.746497] [<ffffff80082afa80>] do_last+0x458/0x868
[  402.751475] [<ffffff80082aff20>] path_openat+0x90/0x378
[  402.756714] [<ffffff80082b14a8>] do_filp_open+0x70/0xe8
[  402.761953] [<ffffff800829ccbc>] do_sys_open+0x174/0x258
[  402.767279] [<ffffff800829ce24>] SyS_openat+0x3c/0x50
[  402.772345] [<ffffff80080833dc>] __sys_trace_return+0x0/0x4
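
For illustration only, here is a minimal sketch of how the recursion appears to arise. The function and lock names are taken from the backtrace above; the bodies and the direct call from the resume callback are my assumptions, not the actual driver source, and the devfreq/actmon hops in between are collapsed into comments.

#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
#include <linux/rwsem.h>

int nvhost_module_busy(struct platform_device *dev)
{
	struct nvhost_device_data *pdata = platform_get_drvdata(dev);

	/* First acquisition, e.g. from the nvhost_channelopen() path. */
	down_read(&pdata->busy_lock);

	/*
	 * Resuming the device synchronously runs the runtime-PM resume
	 * callback below while busy_lock is still read-held by this task.
	 */
	pm_runtime_get_sync(&dev->dev);

	up_read(&pdata->busy_lock);
	return 0;
}

static int nvhost_module_runtime_resume(struct device *dev)
{
	/*
	 * In the real driver this goes through devfreq_resume_device()
	 * -> devfreq_watermark_event_handler() -> nvhost_scale_get_dev_status(),
	 * which calls nvhost_module_busy() again.  The same rwsem is then
	 * read-acquired a second time by the same task, which is what lockdep
	 * reports as a possible A -> A deadlock (it only actually deadlocks
	 * if a writer manages to queue between the two reads).
	 */
	return nvhost_module_busy(to_platform_device(dev));
}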

Note: Building with CONFIG_LOCKDEP=y may or may not show the above warning. Lockdep turns itself off automatically after the first locking issue it finds, so if other lockdep messages appear earlier, those will need to be resolved before this one is displayed.
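
Regarding the report's hint "May be due to missing lock nesting notation": if the inner, re-entrant read of busy_lock in the devfreq/actmon path is actually safe by design, one possible direction would be to take it with an explicit lockdep subclass via down_read_nested(). The snippet below is only a hypothetical sketch of that annotation (the helper name is made up, and I have not verified that this is the correct fix rather than, say, avoiding the re-entrant busy call altogether). The annotation only tells lockdep the nesting is intentional; it does not remove the underlying risk if a writer can queue between the two reads.

#include <linux/lockdep.h>
#include <linux/rwsem.h>

/* Hypothetical helper, not present in the driver. */
static void nvhost_busy_lock_nested(struct nvhost_device_data *pdata)
{
	/*
	 * Take the inner acquisition with an explicit subclass so lockdep
	 * no longer treats it as the same class as the outer down_read().
	 */
	down_read_nested(&pdata->busy_lock, SINGLE_DEPTH_NESTING);
}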

Does this look similar to the topic below?

Hello ShaneCCC,

No, this is a different type of lockdep warning. Here it is complaining about a possible A->A deadlock, whereas in the other (linked) topic lockdep is complaining about a “non-static key”, which usually means that the corresponding lock (e.g. a spinlock in dynamically allocated memory) was used before it was initialized, so lockdep has no lock class registered for it, or a similar cause; see the hypothetical example at the end of this reply.

Also, even though this lockdep report is in the same subsystem (nvhost), the code path seems to be different.
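
For illustration (hypothetical code, not taken from either topic), the “non-static key” class of warning typically comes from using a lock that lives in dynamically allocated memory before it has been initialized:

#include <linux/slab.h>
#include <linux/spinlock.h>

struct foo {
	spinlock_t lock;
	int value;
};

/* Hypothetical example, unrelated to the nvhost driver. */
static struct foo *foo_create(void)
{
	struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

	if (!f)
		return NULL;

	/*
	 * Missing spin_lock_init(&f->lock) here: the first spin_lock(&f->lock)
	 * on this object will make lockdep complain about trying to register
	 * a non-static key, because no lock class was ever set up for it.
	 */
	return f;
}

That is quite different from the A->A recursion shown in the trace above.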