Hi,
I’m using the R21.5 release and have hit this GPU crash more than once. The system becomes more or less unresponsive and has to be rebooted:
[ 1672.628648] gk20a gk20a.0: gk20a_pmu_enable_elpg: gk20a_pmu_enable_elpg(): possible elpg refcnt mismatch. elpg refcnt=2
[ 1672.628659] ------------[ cut here ]------------
[ 1672.628672] WARNING: at drivers/gpu/nvgpu/gk20a/pmu_gk20a.c:3156 gk20a_pmu_enable_elpg+0x1d8/0x2b8()
[ 1672.628677] Modules linked in: nvhost_vi
[ 1672.628692] CPU: 1 PID: 7458 Comm: kworker/1:1 Not tainted 3.10.40-hhone-1.3.0 #24
[ 1672.628702] Workqueue: events pmu_setup_hw
[ 1672.628724] [<c0016370>] (unwind_backtrace+0x0/0x13c) from [<c0012c0c>] (show_stack+0x18/0x1c)
[ 1672.628735] [<c0012c0c>] (show_stack+0x18/0x1c) from [<c0062714>] (warn_slowpath_common+0x5c/0x74)
[ 1672.628746] [<c0062714>] (warn_slowpath_common+0x5c/0x74) from [<c00627e0>] (warn_slowpath_null+0x24/0x2c)
[ 1672.628756] [<c00627e0>] (warn_slowpath_null+0x24/0x2c) from [<c03fe3b8>] (gk20a_pmu_enable_elpg+0x1d8/0x2b8)
[ 1672.628770] [<c03fe3b8>] (gk20a_pmu_enable_elpg+0x1d8/0x2b8) from [<c03fe920>] (pmu_setup_hw+0x488/0x890)
[ 1672.628780] [<c03fe920>] (pmu_setup_hw+0x488/0x890) from [<c0081c88>] (process_one_work+0x13c/0x444)
[ 1672.628791] [<c0081c88>] (process_one_work+0x13c/0x444) from [<c00829c0>] (worker_thread+0x140/0x3dc)
[ 1672.628805] [<c00829c0>] (worker_thread+0x140/0x3dc) from [<c0088ee0>] (kthread+0xd4/0xd8)
[ 1672.628816] [<c0088ee0>] (kthread+0xd4/0xd8) from [<c000ed98>] (ret_from_fork+0x14/0x20)
[ 1672.628825] ---[ end trace 40af022902fb81c6 ]---
[ 1672.687667] gk20a gk20a.0: gk20a_pmu_disable_elpg: gk20a_pmu_disable_elpg(): possible elpg refcnt mismatch. elpg refcnt=1
[ 1672.687680] ------------[ cut here ]------------
[ 1672.687700] WARNING: at drivers/gpu/nvgpu/gk20a/pmu_gk20a.c:3192 gk20a_pmu_disable_elpg+0x80/0x2e0()
[ 1672.687708] Modules linked in: nvhost_vi
[ 1672.687730] CPU: 1 PID: 12624 Comm: gstglcontext Tainted: G W 3.10.40-hhone-1.3.0 #24
[ 1672.687752] [<c0016370>] (unwind_backtrace+0x0/0x13c) from [<c0012c0c>] (show_stack+0x18/0x1c)
[ 1672.687765] [<c0012c0c>] (show_stack+0x18/0x1c) from [<c0062714>] (warn_slowpath_common+0x5c/0x74)
[ 1672.687773] [<c0062714>] (warn_slowpath_common+0x5c/0x74) from [<c00627e0>] (warn_slowpath_null+0x24/0x2c)
[ 1672.687782] [<c00627e0>] (warn_slowpath_null+0x24/0x2c) from [<c03feda8>] (gk20a_pmu_disable_elpg+0x80/0x2e0)
[ 1672.687795] [<c03feda8>] (gk20a_pmu_disable_elpg+0x80/0x2e0) from [<c03e45c0>] (gk20a_alloc_obj_ctx+0x8c8/0xbd4)
[ 1672.687805] [<c03e45c0>] (gk20a_alloc_obj_ctx+0x8c8/0xbd4) from [<c03d3e68>] (gk20a_channel_ioctl+0x6a8/0x10cc)
[ 1672.687819] [<c03d3e68>] (gk20a_channel_ioctl+0x6a8/0x10cc) from [<c01588b8>] (do_vfs_ioctl+0x3f8/0x5b8)
[ 1672.687829] [<c01588b8>] (do_vfs_ioctl+0x3f8/0x5b8) from [<c0158ad0>] (SyS_ioctl+0x58/0x168)
[ 1672.687839] [<c0158ad0>] (SyS_ioctl+0x58/0x168) from [<c000ed00>] (ret_fast_syscall+0x0/0x30)
[ 1672.687845] ---[ end trace 40af022902fb81c7 ]---
Can you help me understand where the problem is?
Thank you, best regards.
Ivan