Dota2 crashes driver at the end of a match on GeForce GTX 670MX

Difficult to reproduce, but it has happened at least 5 times in the past month: at the end of a game (usually after a few matches), after pressing “continue”, the screen fades to black and freezes. The dota2 cursor is still visible and moves around fine, but keyboard input is so laggy that nothing happens. The screen is basically stuck.

The CPU is pegged at 200% for the dota2 process, the lxdm (display manager) process, and the Xorg server process, which is probably why keyboard input has no effect.

After hitting ctrl+alt+f2 (to get to a tty) repeatedly for a minute or two, I can finally kill the dota2 process. Otherwise, the window manager just doesn’t respond (i.e. I can’t switch workspaces).

In system logs I noticed these lines around the time it crashed:

Aug 27 19:25:03 clevo kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception on GPC 0: WIDTH CT Violation. Coordinates: (0x120, 0x0)
Aug 27 19:25:03 clevo kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x500420=0x80000010 0x500434=0x120 0x500438=0x1900 0x50043c=0x0
Aug 27 19:25:03 clevo kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception on GPC 1: WIDTH CT Violation. Coordinates: (0x160, 0x0)
Aug 27 19:25:03 clevo kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x508420=0x80000010 0x508434=0x160 0x508438=0x1900 0x50843c=0x0
Aug 27 19:25:03 clevo kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception on GPC 2: WIDTH CT Violation. Coordinates: (0x130, 0x0)
Aug 27 19:25:03 clevo kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ESR 0x510420=0x80000010 0x510434=0x130 0x510438=0x1900 0x51043c=0x0
Aug 27 19:25:03 clevo kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception: ChID 002c, Class 0000a097, Offset 0000081c, Data 006e0000

So apparently the nvidia driver is having a problem, which is why I have not reported this to the dota2 team yet.
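For anyone who wants to check their own logs for the same pattern, here is a minimal sketch that pulls the Xid code out of NVRM kernel lines like the ones above and counts how often each code appears. The function names (`parse_xid_lines`, `count_xid_codes`) are made up for illustration, and the regular expression only assumes the line layout visible in this report; other driver versions may format these lines differently.

```python
import re
from collections import Counter

# Matches kernel log lines of the form seen in this report, e.g.
#   "... kernel: NVRM: Xid (PCI:0000:01:00): 13, Graphics Exception ..."
# The layout is an assumption based on the lines pasted above.
XID_RE = re.compile(
    r"NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<code>\d+), (?P<detail>.*)"
)


def parse_xid_lines(lines):
    """Return a list of (pci_address, xid_code, detail) for every Xid line."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append(
                (match.group("pci"), int(match.group("code")), match.group("detail").strip())
            )
    return events


def count_xid_codes(lines):
    """Count how many times each Xid code occurs in the given log lines."""
    return Counter(code for _pci, code, _detail in parse_xid_lines(lines))
```

One way to use it would be to save the kernel log to a file (for example from `journalctl -k`) and pass the file's lines to `count_xid_codes`, which makes it easier to see whether the same Xid code shows up on every freeze.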

If you look at my attached logs, you’ll see I also tried SysRq key combinations, but nothing happened.

I think I had the same problem with the then-latest driver (384.59) before I downgraded to 381.09 (for another issue, related to temperature and the fan going crazy). I will upgrade again to 384.69 and report back if this issue still happens.
Perhaps someone has had this problem too? Please report back.
I use the Vulkan API renderer in Dota2, not OpenGL.

SPECS:
Antergos (Arch Linux)
Kernel 4.9.40-1-lts x86_64
CPU: Intel Core i7-3740QM CPU @ 3.7GHz
GPU: GeForce GTX 670MX [GK104M]
GPU Driver: nvidia proprietary 381.09
RAM: 16GB
MOBO: Clevo P370EM
Window manager: i3 (no compositor active)

Note that I use the nvidia-driver .run installer provided by nVidia.
Thanks for your time.
nvidia-bug-report.log.gz (327 KB)
nvidia-installer.log.gz (4.94 KB)

The crash happened again with the latest 384.69 driver.

syslog output:

Sep 11 20:56:19 clevo kernel: NVRM: GPU at PCI:0000:01:00: GPU-3d2e1720-3ac2-b3d0-73cf-e44dab47add3
Sep 11 20:56:19 clevo kernel: NVRM: Xid (PCI:0000:01:00): 31, Ch 0000002c, engmask 00000111, intr 10000000
Sep 11 20:57:08 clevo crash_20170911191906_1.dmp[29450]: Uploading dump (out-of-process)
                                                         /tmp/dumps/crash_20170911191906_1.dmp
Sep 11 20:57:08 clevo kernel: Video Decode Th[20402]: segfault at 68 ip 00007f1837ccd1dc sp 00007f17ee3bae90 error 4 in libnvidia-glcore.so.384.69[7f1836895000+149b000]
Sep 11 20:57:08 clevo systemd[1]: Started Process Core Dump (PID 29451/UID 0).
Sep 11 20:57:13 clevo crash_20170911191906_1.dmp[29450]: Finished uploading minidump (out-of-process): success = yes
Sep 11 20:57:13 clevo crash_20170911191906_1.dmp[29450]: response: Discarded=1
Sep 11 20:57:13 clevo crash_20170911191906_1.dmp[29450]: file ''/tmp/dumps/crash_20170911191906_1.dmp'', upload yes: ''Discarded=1''
Sep 11 20:58:00 clevo systemd-coredump[29453]: Process 20375 (dota2) of user 1000 dumped core.
                                               
                                               Stack trace of thread 20402:
                                               #0  0x00007f1837ccd1dc n/a (libnvidia-glcore.so.384.69)
                                               #1  0x00007f18351e4e8b n/a (librendersystemvulkan.so)
                                               #2  0x00007f18351e5270 n/a (librendersystemvulkan.so)
                                               #3  0x00007f18351e55a5 n/a (librendersystemvulkan.so)
                                               #4  0x00007f183519f22b n/a (librendersystemvulkan.so)
                                               #5  0x00007f181a48f34d n/a (libpanorama.so)
                                               #6  0x00007f181a48fe3f n/a (libpanorama.so)
                                               #7  0x00007f18218f7a6b n/a (libclient.so)
                                               #8  0x00007f181d7b24f7 _ZN12CVideoPlayer24PresentDecodedVideoFrameEv (libvideo.so)
                                               #9  0x00007f181d7b7fb1 _ZN12CVideoPlayer9BRunFrameEv (libvideo.so)
                                               #10 0x00007f181d7ba5f6 _ZN19CVideoPlayerManager10ThreadFuncEb (libvideo.so)
                                               #11 0x00007f181d7bb1bd _ZN19CVideoPlayerManager12CVideoThread3RunEv (libvideo.so)
                                               #12 0x00007f181d7d7c5a _ZN16SteamThreadTools7CThread22ThreadExceptionWrapperEPv (libvideo.so)
                                               #13 0x00007f181d7d6b97 _ZN22CatchAndWriteContext_t6InvokeEv (libvideo.so)
                                               #14 0x00007f181d7d76aa CatchAndWriteMiniDumpExForVoidPtrFn (libvideo.so)
                                               #15 0x00007f181d7d8a3e _ZN16SteamThreadTools7CThread10ThreadProcEPv (libvideo.so)
                                               #16 0x00007f184185d049 start_thread (libpthread.so.0)
                                               #17 0x00007f1841b61f0f __clone (libc.so.6)
                                               
                                               Stack trace of thread 20375:
                                               #0  0x00007f18418631ad pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                                               #1  0x00007f183db5fc32 _ZN17CThreadSyncObject8WaitImplEj (libtier0.so)
                                               #2  0x00007f183519b697 n/a (librendersystemvulkan.so)
                                               #3  0x00007f183519e4cd n/a (librendersystemvulkan.so)
                                               #4  0x00007f1835201462 n/a (librendersystemvulkan.so)
                                               #5  0x00007f18351e4924 n/a (librendersystemvulkan.so)
                                               #6  0x00007f183519a041 n/a (librendersystemvulkan.so)
                                               #7  0x00007f18351a036a n/a (librendersystemvulkan.so)
                                               #8  0x00007f183e1461c3 n/a (libengine2.so)
                                               #9  0x00007f183e13a3dc n/a (libengine2.so)
                                               #10 0x00007f183e13aa25 n/a (libengine2.so)
                                               #11 0x00007f183e12441e n/a (libengine2.so)
                                               #12 0x00007f183e127ed8 n/a (libengine2.so)
                                               #13 0x0000561b8c7a91ff n/a (dota2)
                                               #14 0x00007f1841a944ca __libc_start_main (libc.so.6)
                                               #15 0x0000561b8c7a9295 _start (dota2)

I’ve reported this to the Linux Dota2 team just in case: [Linux] [Nvidia] GPU crash after end of match · Issue #1314 · ValveSoftware/Dota-2 · GitHub

I will be using their OpenGL renderer instead of Vulkan starting today and will report back on whether the crash happens again.

I think it’s safe to say it’s a problem with the Vulkan API, as it doesn’t seem to crash when using the OpenGL renderer.

LOL, it would be safer to say the issue is in whatever shitware provides librendersystemvulkan.so , probably steam.

Yeah, dota2 ships this library in “/dota 2 beta/game/bin/linuxsteamrt64/librendersystemvulkan.so” but the dota2 team seems to think it’s a problem with the nVidia driver.

Not sure how to progress further from there.

I asked the local Vulkan expert and he thinks this is probably bad out-of-memory handling. Do you have a lot of other apps that use a lot of video memory, and does the problem occur less often if you reduce the graphics settings in Dota 2?
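To test the out-of-memory theory, it may help to log video memory usage while the game runs. The following is a rough sketch that reads used and total memory through `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits`. The helper names are hypothetical, and on some GeForce cards the tool may report a non-numeric value for these fields, in which case the parser returns `None` instead of guessing.

```python
import subprocess

# Command line for querying used/total video memory in MiB, one line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line):
    """Parse one 'used, total' line into a tuple of ints (MiB).

    Returns None when the line does not hold exactly two numeric fields,
    for example when the driver reports a field as unsupported.
    """
    parts = [part.strip() for part in line.split(",")]
    if len(parts) != 2:
        return None
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None


def read_gpu_memory():
    """Run nvidia-smi once and return one parsed entry per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return [parse_memory_line(line) for line in result.stdout.splitlines() if line.strip()]
```

Calling `read_gpu_memory()` every few seconds from a second terminal and writing the result to a file would show whether used memory is close to the total right before a freeze.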

Thanks for the reply. The only other program running is google-chrome, which does use some video memory.

I’ll give it another try over the next few weeks with nothing else running and with lower graphics settings in the game.