Multiple CUDA/RTX/Vulkan applications crashing with Xid (13, 109) errors

[141142.891254] NVRM: Xid (PCI:0000:01:00): 109, pid=36464, name=Warframe.x64.ex, Ch 000000ce, errorString CTX SWITCH TIMEOUT, Info 0x1c067

[51724.587479] NVRM: Xid (PCI:0000:01:00): 109, pid=53280, name=Warframe.x64.ex, Ch 00000096, errorString CTX SWITCH TIMEOUT, Info 0x2c05e

It happens consistently. Kernel 6.4.7, driver 535.86.05. X11, KDE. Also, my screen still flickers all the time.

nvidia-bug-report.log.gz (1.4 MB)

Kernel: 6.4.7
Driver: 535.86.05
GPU: 4070

Multiple games run on Proton (GE, 7.x, 8.x) work fine for minutes, sometimes hours, but ultimately hang after this error. Example from EVE Online:
kernel: NVRM: Xid (PCI:0000:3e:00): 109, pid=1201042, name=exefile.exe, Ch 000000e6, errorString CTX SWITCH TIMEOUT, Info 0x86c06d

Has anyone found workarounds/fixes or identified what the latest provably functional version combination is? There seems to be uncertainty on this in the thread.

Nvidia driver version 520.56.06 compiled for kernel 6.0.19 should be stable. I am not able to get that driver to work on any later kernels in Arch Linux. Hopefully that works for your 4070.

I wasn’t able to boot with this combination. Thanks anyways.

@amrits, can you share whether NVIDIA has been able to pinpoint the culprit and find a possible remedy in the driver? If so, is there a timetable for a fix? Thanks.

My new card, and I understand this is true for many others as well, is pretty much useless.

I am using Linux-tkg and nvidia-All packages from TKG here, if it helps. You should be able to pick kernel 6.0.19 and Nvidia 520.56.06 from there and build your stuff.

A kernel and driver that old might just not work for your new card, as this combination is from October 2022. Something obviously broke in either kernel versions 6.1+ or Nvidia driver versions 525+.

Same problem here, driver version 535.86.05 (RTX 3080 Ti)
Arch Linux kernel 6.4.8-arch1-1
Horizon Zero Dawn with latest Proton Experimental (gamemoderun %command%). I played for about 100 minutes with occasional freezes, but now I can’t continue anymore: the game freezes within the first 1-2 minutes after loading my savegame.
nvidia-bug-report.log.gz (869.8 KB)

[ 795.683492] NVRM: Xid (PCI:0000:2b:00): 31, pid=5972, name=HorizonZeroDawn, Ch 0000003a, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
[ 795.695973] NVRM: Xid (PCI:0000:2b:00): 31, pid=5972, name=HorizonZeroDawn, Ch 0000003a, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x1a_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
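For anyone comparing Xid 31 reports, here is a minimal Python sketch that pulls the MMU fault fields out of a line like the two above. The field layout is inferred only from those two lines, not from any NVIDIA documentation, and the names `MMU_FAULT_RE` and `parse_mmu_fault` are made up for illustration.

```python
import re

# Layout inferred from the two Xid 31 lines quoted in this thread;
# this is an assumption, not an official format description.
MMU_FAULT_RE = re.compile(
    r"MMU Fault: ENGINE (?P<engine>\S+) (?P<client>\S+) "
    r"faulted @ (?P<address>0x[0-9a-fA-F]+_[0-9a-fA-F]+)\. "
    r"Fault is of type (?P<fault_type>\S+) (?P<access_type>\S+)"
)

def parse_mmu_fault(line):
    """Return the MMU fault fields of an Xid 31 line, or None if absent."""
    m = MMU_FAULT_RE.search(line)
    return m.groupdict() if m else None

# One of the lines from the post above, copied verbatim.
LINE = ("[ 795.695973] NVRM: Xid (PCI:0000:2b:00): 31, pid=5972, name=HorizonZeroDawn, "
        "Ch 0000003a, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 "
        "faulted @ 0x1a_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ")

if __name__ == "__main__":
    print(parse_mmu_fault(LINE))
```

The address is kept as the raw string from the log, since how the two halves around the underscore combine is not stated anywhere in this thread.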

I used linux-tkg 6.0.19 and nvidia 520.56.06 with Horizon Zero Dawn and I can’t get past the main menu, dmesg:

[ 1143.235434] [drm:nv_drm_fence_context_create_ioctl [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00002b00] Failed to allocate fence signaling event

Sad, I’d hoped that it would work

Depressing isn’t it? We’re fast approaching this problem being a year old with no fix in sight. At this point it seems Nvidia has labeled this a “WON’T FIX”.

Yep, I see that they’ve released a new driver 11h ago, but it doesn’t mention this bug and I doubt that it’ll work…

Sure enough, it still does not work. Updated to 535.98 and Horizon Zero Dawn still freezes.
nvidia-bug-report.log.gz (399.9 KB)

This is the only game I have trouble with. Spider-Man Remastered and The Last of Us (both Sony ports, btw) work like a charm.

Hate to do a +1, but I also have this issue (both 13 and 109) with Satisfactory. Only mentioning since no one did so far.

Kernel version 6.4.9 and Nvidia driver version 535.98 working well for me in games so far. I am no longer receiving 109 “CTX Switch Timeout” errors, but my G-Sync screen is still flickering at 144/165hz.

nvidia-bug-report.log.gz (2.2 MB)

Not here, unfortunately. Kernel 6.4.9 and nvidia 535.98:

$  sudo dmesg -T | tail -1
[Thu Aug 10 17:40:37 2023] NVRM: Xid (PCI:0000:01:00): 109, pid=14575, name=gstglcontext, Ch 00000081, errorString CTX SWITCH TIMEOUT, Info 0x1bc044

@amrits Do you know if you guys have a fix ready for this and if yes, do you plan to release it with driver version 545?

$ dmesg -T | grep Xid
[wo aug  9 21:37:06 2023] NVRM: Xid (PCI:0000:2d:00): 109, pid=1894800, name=ForzaHorizon4.e, Ch 00000080, errorString CTX SWITCH TIMEOUT, Info 0x3c03e
[wo aug  9 22:26:35 2023] NVRM: Xid (PCI:0000:2d:00): 109, pid=1947451, name=ForzaHorizon4.e, Ch 00000080, errorString CTX SWITCH TIMEOUT, Info 0x8c03e
[za aug 12 22:51:02 2023] NVRM: Xid (PCI:0000:2d:00): 109, pid=3956219, name=ForzaHorizon4.e, Ch 00000050, errorString CTX SWITCH TIMEOUT, Info 0x8c032
[zo aug 13 23:12:45 2023] NVRM: Xid (PCI:0000:2d:00): 109, pid=2499113, name=ForzaHorizon4.e, Ch 000000b8, errorString CTX SWITCH TIMEOUT, Info 0x8c04d
[ma aug 14 13:20:07 2023] NVRM: Xid (PCI:0000:2d:00): 109, pid=2706301, name=ForzaHorizon4.e, Ch 00000070, errorString CTX SWITCH TIMEOUT, Info 0x3c032
[ma aug 14 13:38:21 2023] NVRM: Xid (PCI:0000:2d:00): 109, pid=2774651, name=ForzaHorizon4.e, Ch 00000070, errorString CTX SWITCH TIMEOUT, Info 0x3c032
$ uname -a
Linux console 6.2.0-26-generic #26-Ubuntu SMP PREEMPT_DYNAMIC Mon Jul 10 23:39:54 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ lspci | grep -i vga
2d:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)
$ nvidia-smi 
Mon Aug 14 14:23:59 2023       
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA GeForce GTX 1660 Ti     Off | 00000000:2D:00.0  On |                  N/A |
|  0%   57C    P3              27W / 130W |   4381MiB /  6144MiB |     30%      Default |
|                                         |                      |                  N/A |
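Since so many posts here paste raw `dmesg` lines, a small script can make reports easier to compare. The following is a rough Python sketch that extracts the fields from Xid lines and counts events per process. The regular expression is inferred from the log lines quoted in this thread, not from an NVIDIA specification, and the names `XID_RE`, `parse_xid_lines`, and `summarize` are hypothetical.

```python
import re
from collections import Counter

# Field layout inferred from the Xid lines pasted in this thread (an assumption).
# The errorString/Info part is optional because Xid 31 lines do not carry it.
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<bus>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), name=(?P<name>[^,]+), Ch (?P<channel>[0-9a-fA-F]+)"
    r"(?:, errorString (?P<error>[^,]+), Info (?P<info>0x[0-9a-fA-F]+))?"
)

def parse_xid_lines(text):
    """Return one dict per Xid line found in `text` (e.g. dmesg output)."""
    return [m.groupdict() for m in XID_RE.finditer(text)]

def summarize(events):
    """Count events per (Xid number, process name) pair."""
    return Counter((int(e["xid"]), e["name"]) for e in events)

# Two lines copied from the dmesg output above.
SAMPLE = """\
[wo aug  9 21:37:06 2023] NVRM: Xid (PCI:0000:2d:00): 109, pid=1894800, name=ForzaHorizon4.e, Ch 00000080, errorString CTX SWITCH TIMEOUT, Info 0x3c03e
[ma aug 14 13:20:07 2023] NVRM: Xid (PCI:0000:2d:00): 109, pid=2706301, name=ForzaHorizon4.e, Ch 00000070, errorString CTX SWITCH TIMEOUT, Info 0x3c032
"""

if __name__ == "__main__":
    for (xid, name), count in summarize(parse_xid_lines(SAMPLE)).items():
        print(f"Xid {xid} {name}: {count}")
```

To use it on a real machine, the text passed to `parse_xid_lines` could come from saved `dmesg -T` output instead of `SAMPLE`.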

Still flickering in version 535.104.05.

Issue persists in version 535.104.05, though rare for me (this is the first one in a few days):

[145442.508839] NVRM: GPU at PCI:0000:01:00: GPU-3440fcd9-ad72-5684-052e-87619260bcbf
[145442.508842] NVRM: Xid (PCI:0000:01:00): 109, pid=125936, name=Warframe.x64.ex, Ch 000000be, errorString CTX SWITCH TIMEOUT, Info 0x4ac06d

Monitor still flickers intermittently. I am reverting to Nvidia driver 520.56.06 and kernel 6.0.19, which at this point is a ridiculous workaround.

nvidia-bug-report.log.gz (2.6 MB)

I can reproduce this pretty consistently when trying to launch a new game in Starfield.

Sep 01 02:42:58 arch-desktop kernel: NVRM: Xid (PCI:0000:0c:00): 109, pid=5974, name=Starfield.exe, Ch 000000ce, errorString CTX SWITCH TIMEOUT, Info 0x1c066

Arch Linux
Ryzen 3900x
Nvidia 3090 w/ 535.104.05-2 drivers
Proton Experimental

The error consistently occurs in Starfield for anyone on the 535 stable driver and kernel 5.15+.

More notes can be found here: “Starfield - Crashing during first loading screen. Seems related to view map pressure.” (Issue #1678, HansKristian-Work/vkd3d-proton, GitHub)

The flickering is so annoying. Does anyone else experience this, or is it just me? Nvidia driver 535.104.05, kernel 6.4.12, Xorg (X11), single 144 Hz G-Sync monitor, Arch Linux.