GPU driver crash

I’m using the R21.5 release and have hit this GPU crash more than once. The system becomes more or less unresponsive and has to be rebooted:

[ 1672.628648] gk20a gk20a.0: gk20a_pmu_enable_elpg: gk20a_pmu_enable_elpg(): possible elpg refcnt mismatch. elpg refcnt=2
[ 1672.628659] ------------[ cut here ]------------
[ 1672.628672] WARNING: at drivers/gpu/nvgpu/gk20a/pmu_gk20a.c:3156 gk20a_pmu_enable_elpg+0x1d8/0x2b8()
[ 1672.628677] Modules linked in: nvhost_vi
[ 1672.628692] CPU: 1 PID: 7458 Comm: kworker/1:1 Not tainted 3.10.40-hhone-1.3.0 #24
[ 1672.628702] Workqueue: events pmu_setup_hw
[ 1672.628724] [<c0016370>] (unwind_backtrace+0x0/0x13c) from [<c0012c0c>] (show_stack+0x18/0x1c)
[ 1672.628735] [<c0012c0c>] (show_stack+0x18/0x1c) from [<c0062714>] (warn_slowpath_common+0x5c/0x74)
[ 1672.628746] [<c0062714>] (warn_slowpath_common+0x5c/0x74) from [<c00627e0>] (warn_slowpath_null+0x24/0x2c)
[ 1672.628756] [<c00627e0>] (warn_slowpath_null+0x24/0x2c) from [<c03fe3b8>] (gk20a_pmu_enable_elpg+0x1d8/0x2b8)
[ 1672.628770] [<c03fe3b8>] (gk20a_pmu_enable_elpg+0x1d8/0x2b8) from [<c03fe920>] (pmu_setup_hw+0x488/0x890)
[ 1672.628780] [<c03fe920>] (pmu_setup_hw+0x488/0x890) from [<c0081c88>] (process_one_work+0x13c/0x444)
[ 1672.628791] [<c0081c88>] (process_one_work+0x13c/0x444) from [<c00829c0>] (worker_thread+0x140/0x3dc)
[ 1672.628805] [<c00829c0>] (worker_thread+0x140/0x3dc) from [<c0088ee0>] (kthread+0xd4/0xd8)
[ 1672.628816] [<c0088ee0>] (kthread+0xd4/0xd8) from [<c000ed98>] (ret_from_fork+0x14/0x20)
[ 1672.628825] ---[ end trace 40af022902fb81c6 ]---
[ 1672.687667] gk20a gk20a.0: gk20a_pmu_disable_elpg: gk20a_pmu_disable_elpg(): possible elpg refcnt mismatch. elpg refcnt=1
[ 1672.687680] ------------[ cut here ]------------
[ 1672.687700] WARNING: at drivers/gpu/nvgpu/gk20a/pmu_gk20a.c:3192 gk20a_pmu_disable_elpg+0x80/0x2e0()
[ 1672.687708] Modules linked in: nvhost_vi
[ 1672.687730] CPU: 1 PID: 12624 Comm: gstglcontext Tainted: G        W    3.10.40-hhone-1.3.0 #24
[ 1672.687752] [<c0016370>] (unwind_backtrace+0x0/0x13c) from [<c0012c0c>] (show_stack+0x18/0x1c)
[ 1672.687765] [<c0012c0c>] (show_stack+0x18/0x1c) from [<c0062714>] (warn_slowpath_common+0x5c/0x74)
[ 1672.687773] [<c0062714>] (warn_slowpath_common+0x5c/0x74) from [<c00627e0>] (warn_slowpath_null+0x24/0x2c)
[ 1672.687782] [<c00627e0>] (warn_slowpath_null+0x24/0x2c) from [<c03feda8>] (gk20a_pmu_disable_elpg+0x80/0x2e0)
[ 1672.687795] [<c03feda8>] (gk20a_pmu_disable_elpg+0x80/0x2e0) from [<c03e45c0>] (gk20a_alloc_obj_ctx+0x8c8/0xbd4)
[ 1672.687805] [<c03e45c0>] (gk20a_alloc_obj_ctx+0x8c8/0xbd4) from [<c03d3e68>] (gk20a_channel_ioctl+0x6a8/0x10cc)
[ 1672.687819] [<c03d3e68>] (gk20a_channel_ioctl+0x6a8/0x10cc) from [<c01588b8>] (do_vfs_ioctl+0x3f8/0x5b8)
[ 1672.687829] [<c01588b8>] (do_vfs_ioctl+0x3f8/0x5b8) from [<c0158ad0>] (SyS_ioctl+0x58/0x168)
[ 1672.687839] [<c0158ad0>] (SyS_ioctl+0x58/0x168) from [<c000ed00>] (ret_fast_syscall+0x0/0x30)
[ 1672.687845] ---[ end trace 40af022902fb81c7 ]---

Can you help me understand where the problem is?

Thank you, best regards.


Does this show all “ok”?

sha1sum -c /etc/nv_tegra_release
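If the output is long, it may be easier to look only at the entries that did not pass. A minimal sketch, assuming the usual `sha1sum -c` output where each passing file is reported as `<path>: OK`; the function name `failed_checksums` is just a placeholder for illustration:

```shell
#!/bin/sh
# Reads `sha1sum -c` output on stdin and prints only the lines that are not OK.
# Intended use: sha1sum -c /etc/nv_tegra_release | failed_checksums
failed_checksums() {
    # grep exits non-zero when every line is OK; "|| true" hides that status.
    grep -v ': OK$' || true
}
```

No output from the filter would mean every listed file matched its checksum.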

When does this occur? For example, has something been running for a while, does it happen right after boot completes, or is there a particular program that triggers it?
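One way to narrow this down might be to check the kernel log for the warning and for the task names that appear in the traces. This is only a sketch: it assumes the log lines look like the ones posted above (the `possible elpg refcnt mismatch` message and the `Comm:` field), and the function names are made up for illustration. Note that `warning_tasks` lists the tasks from every trace in the log, not only the elpg ones.

```shell
#!/bin/sh
# Both functions read kernel log text on stdin, e.g.:
#   dmesg | count_elpg_warnings
#   dmesg | warning_tasks

# Print how many elpg refcount warnings the log contains.
count_elpg_warnings() {
    # grep -c prints 0 but exits non-zero on no match; "|| true" hides that.
    grep -c 'possible elpg refcnt mismatch' || true
}

# Print the unique task names found in "Comm:" fields of the traces.
warning_tasks() {
    sed -n 's/.*Comm: \([^ ]*\) .*/\1/p' | sort -u
}
```

Running these after each occurrence and comparing the task names over time could show whether the same program is always involved.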

Could you try this workaround?

echo 0 | sudo tee /sys/devices/platform/host1x/gk20a.0/elpg_enable
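For scripting the workaround, something like the following might help. It is a sketch under assumptions: the sysfs path is the one given above, while `ELPG_NODE` and `set_elpg` are hypothetical names. It falls back to `sudo tee` when the node is not writable, because output redirection is performed by the calling shell and not by sudo.

```shell
#!/bin/sh
# Path taken from the workaround above; override ELPG_NODE to point elsewhere.
ELPG_NODE="${ELPG_NODE:-/sys/devices/platform/host1x/gk20a.0/elpg_enable}"

# set_elpg VALUE: write VALUE to the elpg_enable node (0 is the workaround value).
set_elpg() {
    if [ ! -e "$ELPG_NODE" ]; then
        echo "elpg node not found: $ELPG_NODE" >&2
        return 1
    fi
    if [ -w "$ELPG_NODE" ]; then
        # Already running with enough privileges: plain redirection works.
        printf '%s\n' "$1" > "$ELPG_NODE"
    else
        # Not writable by this user: let tee do the write as root.
        printf '%s\n' "$1" | sudo tee "$ELPG_NODE" > /dev/null
    fi
    # Read the value back to confirm what the node now reports.
    echo "elpg_enable is now: $(cat "$ELPG_NODE")"
}
```

After sourcing the script, `set_elpg 0` applies the workaround. Values written to sysfs do not survive a reboot, so it would have to be reapplied after each boot.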

Hi linuxdev and WayneWWW,

Yes, the sha1sum is OK. The issue seems to be triggered by an application that uses the GStreamer C API with omxh264enc and omxh264dec, but I’m not sure, because it happens only occasionally and I don’t have steps to reproduce it.

Thank you WayneWWW, I will try this. What is the impact of disabling elpg?

Thank you, best regards.


Hi IvanGolob,

Have you tested the workaround? Did you see any abnormal behavior or crashes?


Hi kayccc,

After I set “elpg_enable” to 0, I have not seen the GPU crash anymore.

Thank you, best regards.


Hi kayccc, WayneWWW, or IvanGolob,

I have been seeing a GPU crash similar to the one above. I have also tried the patch here ( but that did not do anything. Disabling elpg seems to work.

But I need to know what the implications of disabling elpg are. I looked through the documentation but could not find what exactly elpg is for.

If I disable it, the crash no longer occurs, but what are the consequences of disabling it?