Hello,
If I build a RedHawk kernel with CONFIG_LOCKDEP=y, I see the message below, and I am not sure how to fix it. From the backtrace, however, it looks like nvhost_module_busy() ends up calling itself without either releasing ('up'-ing) the rwsem pdata->busy_lock or checking whether anyone else already holds it and proceeding accordingly. I would like to know whether there is a fix for this; I am not very familiar with the driver itself, so I cannot make the correct changes on my own. (A simplified sketch of the cycle is included after the trace below.)
[ 402.467394] =============================================
[ 402.472807] [ INFO: possible recursive locking detected ]
[ 402.478221] 4.9.201-rt134-r32.5.1-tegra-RedHawk-7.5.4-debug #1 Tainted: G W
[ 402.486336] ---------------------------------------------
[ 402.491750] nvargus-daemon/4373 is trying to acquire lock:
[ 402.497249] (&pdata->busy_lock){.+.+.+}, at: [<ffffff80084f8a18>] nvhost_module_busy+0x50/0x168
[ 402.506103]
[ 402.506103] but task is already holding lock:
[ 402.511951] (&pdata->busy_lock){.+.+.+}, at: [<ffffff80084f8a18>] nvhost_module_busy+0x50/0x168
[ 402.520798]
[ 402.520798] other info that might help us debug this:
[ 402.527343] Possible unsafe locking scenario:
[ 402.527343]
[ 402.533278] CPU0
[ 402.535730] ----
[ 402.538179] lock(&pdata->busy_lock);
[ 402.541956] lock(&pdata->busy_lock);
[ 402.545733]
[ 402.545733] *** DEADLOCK ***
[ 402.545733]
[ 402.551669] May be due to missing lock nesting notation
[ 402.551669]
[ 402.558476] 1 lock held by nvargus-daemon/4373:
[ 402.563017] #0: (&pdata->busy_lock){.+.+.+}, at: [<ffffff80084f8a18>] nvhost_module_busy+0x50/0x168
[ 402.572307]
[ 402.572307] stack backtrace:
[ 402.576679] CPU: 1 PID: 4373 Comm: nvargus-daemon Tainted: G W 4.9.201-rt134-r32.5.1-tegra-RedHawk-7.5.4-debug #1
[ 402.588102] Hardware name: NVIDIA Jetson Nano Developer Kit (DT)
[ 402.594125] Call trace:
[ 402.596581] [<ffffff800808c4d8>] dump_backtrace+0x0/0x1b0
[ 402.601995] [<ffffff800808c7c4>] show_stack+0x24/0x30
[ 402.607062] [<ffffff8008423318>] dump_stack+0xa0/0xd0
[ 402.612128] [<ffffff80081223d4>] validate_chain.isra.19+0xbc4/0xc80
[ 402.618413] [<ffffff80081234d4>] __lock_acquire+0x2dc/0x758
[ 402.624000] [<ffffff8008123f20>] lock_acquire+0x108/0x260
[ 402.629415] [<ffffff8008d8e248>] down_read+0x50/0x98
[ 402.634394] [<ffffff80084f8a18>] nvhost_module_busy+0x50/0x168
[ 402.640243] [<ffffff800851eb68>] host1x_actmon_avg_norm+0x68/0xb8
[ 402.646353] [<ffffff8008520cf0>] nvhost_scale_get_dev_status+0x90/0xe0
[ 402.652901] [<ffffff8008b5a840>] devfreq_watermark_event_handler+0x158/0x4b0
[ 402.659970] [<ffffff8008b53e44>] devfreq_resume_device+0x64/0x88
[ 402.665993] [<ffffff80084f9df0>] nvhost_module_runtime_resume+0x1f0/0x288
[ 402.672803] [<ffffff8008776eb4>] pm_generic_runtime_resume+0x3c/0x58
[ 402.679177] [<ffffff80087865f0>] __genpd_runtime_resume+0x38/0xa0
[ 402.685289] [<ffffff8008788e3c>] genpd_runtime_resume+0xa4/0x218
[ 402.691312] [<ffffff800877949c>] __rpm_callback+0x74/0xa0
[ 402.696727] [<ffffff80087794fc>] rpm_callback+0x34/0x98
[ 402.701967] [<ffffff800877aa18>] rpm_resume+0x460/0x738
[ 402.707206] [<ffffff800877ad48>] __pm_runtime_resume+0x58/0xa0
[ 402.713055] [<ffffff80084f8a24>] nvhost_module_busy+0x5c/0x168
[ 402.718905] [<ffffff800850a9b4>] __nvhost_channelopen+0x13c/0x4a0
[ 402.725015] [<ffffff800850ad44>] nvhost_channelopen+0x2c/0x38
[ 402.730779] [<ffffff80082a3d94>] chrdev_open+0x9c/0x1a0
[ 402.736018] [<ffffff800829b318>] do_dentry_open+0x1d8/0x340
[ 402.741606] [<ffffff800829c7f8>] vfs_open+0x58/0x88
[ 402.746497] [<ffffff80082afa80>] do_last+0x458/0x868
[ 402.751475] [<ffffff80082aff20>] path_openat+0x90/0x378
[ 402.756714] [<ffffff80082b14a8>] do_filp_open+0x70/0xe8
[ 402.761953] [<ffffff800829ccbc>] do_sys_open+0x174/0x258
[ 402.767279] [<ffffff800829ce24>] SyS_openat+0x3c/0x50
[ 402.772345] [<ffffff80080833dc>] __sys_trace_return+0x0/0x4
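
To make the cycle in the trace easier to follow, here is a heavily simplified sketch of the pattern as I read it from the backtrace. This is not the actual driver source; the struct and function names other than nvhost_module_busy()/pdata->busy_lock are placeholders I made up, and the bodies only paraphrase the locking behaviour:

/*
 * Sketch of the cycle in the trace above (NOT the real nvhost code).
 */
#include <linux/device.h>
#include <linux/pm_runtime.h>
#include <linux/rwsem.h>

struct pdata_sketch {                       /* stand-in for nvhost_device_data */
	struct rw_semaphore busy_lock;
	struct device *dev;
};

/* nvhost_module_busy(): read-lock busy_lock, then resume the device */
static void module_busy_sketch(struct pdata_sketch *pdata)
{
	down_read(&pdata->busy_lock);       /* 1st (outer) read acquisition */
	pm_runtime_get_sync(pdata->dev);    /* may run the runtime-resume
	                                     * callback synchronously in
	                                     * this same task */
	up_read(&pdata->busy_lock);
}

/*
 * Runtime-resume path from the trace:
 *   nvhost_module_runtime_resume -> devfreq_resume_device ->
 *   devfreq_watermark_event_handler -> nvhost_scale_get_dev_status ->
 *   host1x_actmon_avg_norm -> nvhost_module_busy
 */
static int runtime_resume_sketch(struct device *dev)
{
	struct pdata_sketch *pdata = dev_get_drvdata(dev);

	/* 2nd (inner) read acquisition of the same rwsem by the same task:
	 * this is the recursion lockdep reports.  It only works at all
	 * because rwsems allow multiple readers; if a writer gets queued
	 * on busy_lock between the two down_read() calls, the task can
	 * block on itself.
	 */
	module_busy_sketch(pdata);
	return 0;
}

As far as I understand, lockdep flags this because kernel rwsems do not let new readers jump ahead of a waiting writer, so a task that read-locks the same rwsem twice can deadlock against a writer that queued in between.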
Note: having CONFIG_LOCKDEP=y may or may not produce the warning above. Lockdep turns itself off after the first locking issue it finds, so if there are other lockdep messages, those will need to be resolved before this one is displayed.
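
For what it's worth, the lockdep output itself hints at one direction ("May be due to missing lock nesting notation"): if the inner, re-entrant acquisition is actually safe by design, it can be annotated with down_read_nested() so lockdep puts it in a separate subclass instead of reporting recursion. The sketch below only shows that annotation on a hypothetical inner call site; it assumes the recursion really is harmless, and it silences the report rather than fixing a genuine deadlock, so I am not sure it is the right change for this driver:

#include <linux/lockdep.h>
#include <linux/rwsem.h>

/*
 * Sketch only: hypothetical inner call site, annotated so lockdep
 * treats the nested read acquisition as its own subclass.
 * SINGLE_DEPTH_NESTING is the stock annotation for "this lock is
 * taken at most one level inside itself".
 */
static void inner_busy_path_sketch(struct rw_semaphore *busy_lock)
{
	down_read_nested(busy_lock, SINGLE_DEPTH_NESTING);
	/* ... actmon/devfreq work that needs the module kept busy ... */
	up_read(busy_lock);
}

The alternative would be restructuring the devfreq/actmon path so it does not re-take pdata->busy_lock at all, but I do not know the driver well enough to say which change is correct here.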