Xid109 CTX SWITCH TIMEOUT Driver Crashes In Many Applications

dylanlanigansmith · February 22, 2024, 6:45pm

Cannot reliably use Linux for anything GPU-heavy for the last month or so… I have met many other users facing the same issue and would like to bring it to light.

Example of errors, always Xid 109:
NVRM: Xid (PCI:0000:01:00): 109, pid=168149, name=r5apex_dx12.exe, Ch 00000076, errorString CTX SWITCH TIMEOUT, Info 0x3c046
NVRM: Xid (PCI:0000:01:00): 109, pid=23382, name=cs2, Ch 000000b6, errorString CTX SWITCH TIMEOUT, Info 0x25c05d
NVRM: Xid (PCI:0000:01:00): 109, pid='<unknown>', name=<unknown>, Ch 000000a6, errorString CTX SWITCH TIMEOUT, Info 0x26c058
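
For anyone comparing logs, here is a minimal Python sketch that pulls the fields out of Xid 109 lines like the ones above. The pattern is inferred only from the lines quoted in this thread, and `parse_xid109` is a hypothetical helper name, not part of any NVIDIA tool; other Xid types use different layouts and will not match.

```python
import re

# Pattern inferred from the Xid 109 lines quoted in this thread (an assumption,
# not an official format description).
XID109_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): 109, "
    r"pid=(?P<pid>[^,]*), name=(?P<name>[^,]*), "
    r"Ch (?P<channel>[0-9a-fA-F]+), "
    r"errorString (?P<error>.+?), Info (?P<info>0x[0-9a-fA-F]+)"
)

def parse_xid109(line):
    """Return a dict of fields for an Xid 109 line, or None if the line does not match."""
    m = XID109_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Channel and Info are printed in hex; convert them to integers for comparison.
    fields["channel"] = int(fields["channel"], 16)
    fields["info"] = int(fields["info"], 16)
    return fields

sample = ("NVRM: Xid (PCI:0000:01:00): 109, pid=23382, name=cs2, Ch 000000b6, "
          "errorString CTX SWITCH TIMEOUT, Info 0x25c05d")
print(parse_xid109(sample))
```

The pid is kept as a string on purpose, since the driver does not always report a numeric pid.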

Can consistently reproduce by playing ~1-2 games of CS2 Arms Race; the map Baggage will crash 90% of the time mid-game after a few minutes. It has also occurred in compute-heavy AI stuff, and in games like Apex Legends running through Proton (interestingly, once Apex crashes after 10-45 mins, the game will not run for longer than 5 minutes without another Xid 109 happening). Occasionally X11/KDE Plasma won’t recover from the crash and a full hard reboot is required. This is so consistent that I can reboot, open nothing but Steam/Counter-Strike 2, and have the game crash with Xid 109 within 10 minutes, so testing fixes is easy.

Attempts to Debug:
-Went back to various kernel versions that were stable for GPU usage when I last used them
-Tried 545.29.06, the beta 550.40.07, and the latest Vulkan dev driver (535.43.09)
-Ensured things like power management, ReBAR, etc. had no effect on reproducing the issue
-Had a friend with a 3060 Ti and a near-identical Arch install (apart from a Ryzen vs. my Intel, everything such as driver version, graphics settings, resolution, Vulkan/Mesa stuff, and kernel was the same between us) try to reproduce, and they could not
-Discussed with others who also have the issue; they have tried countless other kernels and are affected on a variety of platforms (AMD Ryzen, 40xx series as well, etc.), so my specific hardware is not the culprit
-Ensured my GPU is stable and fully functional (passed a GPU memory stress test with flying colors, can run heavy loads all night in Windows, ran stress tests, etc.)

Description of Crash
When the crash happens, the screen freezes but audio etc. continues to play in the background. Most of the time it takes ~15 seconds for the system to recover enough to alt-tab or switch terminals; occasionally a hard (reset button) restart is required. Sometimes in Proton apps the screen will freeze, render a few frames after a few seconds, then freeze again, always with Xid 109 in dmesg after the crash. This happens regardless of whether an app is run with DX11 or DX12 in Proton (all dxvk in the end), and also with native Vulkan games like CS2. I have only had it happen during CUDA loads a few times, but I have not done any compute work lately.

Bug report attached! I ran the bug tool immediately after reproducing the crash issue.
nvidia-bug-report.log.gz (937.6 KB)

I would really like to use my GPU again, so anything else I can do to help solve this would be greatly appreciated. I know there is a similar thread for this, however it is two years old and lacking any updates for this issue that renders Linux useless for the majority of my work and leisure activities.

Because I can consistently and quickly reproduce the crash, hopefully I can be of assistance in pinpointing this issue. I am experienced with low-level debugging, if I can get any dumps etc. that might help?

System info:

Arch Linux kernel 6.7.5, (other 6.6.x kernels also cause issue)
Nvidia Driver v.545.29.06 (other drivers also cause issue)
Plasma 5.27.10 through KWin
i7-12700K
RTX 3090
MSI Z690A, 32 GB DDR5

cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=c1c6146b-63dc-46ff-84f3-e7661fed204d rw quiet loglevel=3 ibt=off split_lock_detect=off nvidia_drm.modeset=1

cat /proc/driver/nvidia/params
ResmanDebugLevel: 4294967295
RmLogonRC: 1
ModifyDeviceFiles: 1
DeviceFileUID: 0
DeviceFileGID: 0
DeviceFileMode: 438
InitializeSystemMemoryAllocations: 1
UsePageAttributeTable: 1
EnableMSI: 1
EnablePCIeGen3: 0
MemoryPoolSize: 0
KMallocHeapMaxSize: 0
VMallocHeapMaxSize: 0
IgnoreMMIOCheck: 0
TCEBypassMode: 0
EnableStreamMemOPs: 0
EnableUserNUMAManagement: 1
NvLinkDisable: 0
RmProfilingAdminOnly: 1
PreserveVideoMemoryAllocations: 0
EnableS0ixPowerManagement: 0
S0ixPowerManagementVideoMemoryThreshold: 256
DynamicPowerManagement: 3
DynamicPowerManagementVideoMemoryThreshold: 200
RegisterPCIDriver: 1
EnablePCIERelaxedOrderingMode: 0
EnableResizableBar: 1
EnableGpuFirmware: 18
EnableGpuFirmwareLogs: 2
EnableDbgBreakpoint: 0
OpenRmEnableUnsupportedGpus: 1
DmaRemapPeerMmio: 1
RegistryDwords: ""
RegistryDwordsPerDevice: ""
RmMsg: ""
GpuBlacklist: ""
TemporaryFilePath: ""
ExcludedGpus: ""
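
Some of these values are easier to read in another base: 438 is octal 666, and 4294967295 is 0xFFFFFFFF. A minimal Python sketch for loading a dump like the one above into a dict and printing those two fields that way; `parse_params` is a hypothetical helper name, and the `Key: value` layout is assumed from the output shown here.

```python
def parse_params(text):
    """Parse 'Key: value' lines into a dict.

    Integer values become int, double-quoted values lose their quotes,
    and anything else is kept as a plain string.
    """
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'Key: value'
        key, value = key.strip(), value.strip()
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            params[key] = value[1:-1]
        else:
            try:
                params[key] = int(value)
            except ValueError:
                params[key] = value
    return params

# A few lines copied from the dump above.
sample = '''DeviceFileMode: 438
ResmanDebugLevel: 4294967295
EnableResizableBar: 1
RegistryDwords: ""
'''

params = parse_params(sample)
print(oct(params["DeviceFileMode"]))    # 0o666
print(hex(params["ResmanDebugLevel"]))  # 0xffffffff
```

To run it on a live system, the text of /proc/driver/nvidia/params can be read and passed to `parse_params` in place of `sample`.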

Thank you for any assistance, this is becoming incredibly frustrating.

dylanlanigansmith · February 24, 2024, 7:37pm

Tried updated driver 545.29.06-20.
Can reproduce issue within 5 minutes of playing CS2.

NVRM: Xid (PCI:0000:01:00): 109, pid=5408, name=cs2, Ch 00000096, errorString CTX SWITCH TIMEOUT, Info 0x56c05f

Bug report from immediately after crash attached.
nvidia-bug-report.log.gz (742.7 KB)

Because I can reproduce this issue, I was hoping to hear some potential solutions or versions to try, as I can easily confirm whether they are effective in remedying these Xid 109 driver crashes.

dylanlanigansmith · February 24, 2024, 8:00pm

And on latest driver, 550.54.14, can reproduce just as easily. Kernel 6.7.6-arch1-1.

Xid (PCI:0000:01:00): 109, pid='<unknown>', name=<unknown>, Ch 0000008e, errorString CTX SWITCH TIMEOUT, Info 0x26c047

This time I ran the bug report tool before killing the offending GPU using app (CS2)
nvidia-bug-report.log.gz (795.3 KB)

hennikul · February 27, 2024, 11:40am

I just experienced the same crash here in CS2. Running 550 driver in Ubuntu 23.10.

My card is a brand new 4070 Super, that will be used mostly for OpenCL stuff related to photo editing, but so far all heavy GPU tasks have caused failures.

When OpenCL fails I see errors like this:
[ 266.228441] NVRM: GPU at PCI:0000:0a:00: GPU-617ca489-a0c6-4820-a5d8-bb47f1f232bf
[ 266.228448] NVRM: Xid (PCI:0000:0a:00): 31, pid=8469, name=worker 3, Ch 00000008, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x500_00233000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
[36272.668229] NVRM: Xid (PCI:0000:0a:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 3, TPC 1, SM 0): Out Of Range Address
[36272.668249] NVRM: Xid (PCI:0000:0a:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x51cf30=0x101000e 0x51cf34=0x20 0x51cf28=0xf81eb60 0x51cf2c=0x1174
[36272.668882] NVRM: Xid (PCI:0000:0a:00): 43, pid=20472, name=test_basic, Ch 00000030
[38704.375178] NVRM: Xid (PCI:0000:0a:00): 31, pid='<unknown>', name=<unknown>, Ch 00000038, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7fba_1cac2000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_WRITE
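
Since several different Xid codes show up here, a rough Python sketch for tallying which codes appear in a saved kernel log might help when comparing reports. `count_xids` is a hypothetical helper name, and the pattern only assumes the `NVRM: Xid (PCI:...): <code>,` prefix seen in the lines quoted in this thread.

```python
import re
from collections import Counter

# Matches the common prefix of the NVRM Xid lines quoted in this thread and
# captures the numeric Xid code that follows the PCI address.
XID_CODE_RE = re.compile(r"NVRM: Xid \(PCI:[^)]*\): (\d+),")

def count_xids(lines):
    """Count how often each Xid code appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        m = XID_CODE_RE.search(line)
        if m:
            counts[int(m.group(1))] += 1
    return counts

# Two lines quoted earlier in this thread.
sample_lines = [
    "[36272.668882] NVRM: Xid (PCI:0000:0a:00): 43, pid=20472, name=test_basic, Ch 00000030",
    "NVRM: Xid (PCI:0000:01:00): 109, pid=23382, name=cs2, Ch 000000b6, "
    "errorString CTX SWITCH TIMEOUT, Info 0x25c05d",
]
print(sorted(count_xids(sample_lines).items()))
```

Any iterable of lines works, so an open file handle on saved dmesg output can be passed directly.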

Processing: nvidia-bug-report.log.gz…

bling1987 · August 13, 2024, 6:50am

I am experiencing the same “errorString CTX SWITCH TIMEOUT” when playing Path Of Exile 3.25

nvidia-bug-report.log.gz (871.7 KB)

I can try DX11 or Vulkan and they all do the same thing. It might work for an hour or two and then crash constantly. A reboot might give me 1-2 hours of play time again before it starts crashing.

I can also replicate it with some events in the game. Spider lair map boss room is known to crash.

bling1987 · August 13, 2024, 7:04am

I forgot to mention. I can play World of Tanks with no problems and not experience crashes.

amrits · August 13, 2024, 3:57pm

Hi @bling1987
Could you please test playing Path of Exile 3.25 with the 560 beta driver and share the test results?
If issue persists, could you save the map and share the save file with me? That way, I can load the same environment on my end.

bling1987 · August 13, 2024, 7:39pm

4x King of the Mist fights and 3 times it crashed.

nvidia-bug-report.log.gz (1.3 MB)

Can’t find a way to install the 560 Beta on PopOS.

bling1987 · August 13, 2024, 8:09pm

Beta install failed.
nvidia-installer.log (179.5 KB)

bling1987 · August 14, 2024, 9:01am

Crashed in “The Maven Crucible” fighting “The Hidden” for bosses.

nvidia-bug-report.log.gz (824.3 KB)

ventilhac · November 3, 2024, 11:05pm

Horizon Zero Dawn and Control Ultimate Edition…
nvidia-bug-report.log.gz (860.6 KB)
