Multiple CUDA/RTX/Vulkan applications crashing with Xid (13,109) errors

With 535, both Last of Us Part 1 and Metro Exodus Enhanced Edition seem to be way more stable. I haven’t had a crash in those two games yet. However, like @alxander.lohr, I am still getting crashes in Horizon Zero Dawn.

kernel: NVRM: Xid (PCI:0000:0c:00): 31, pid=19430, name=HorizonZeroDawn, Ch 00000097, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

Happened again while playing Wasteland 3. I’m on 535.54.03 now as well.

[ 1313.129229] NVRM: GPU at PCI:0000:01:00: GPU-d94a8dd3-7c3d-ecc0-033e-1ec177ee9969
[ 1313.129232] NVRM: Xid (PCI:0000:01:00): 109, pid=3174, name=WL3.exe, Ch 00000016, errorString CTX SWITCH TIMEOUT, Info 0x1c00d
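
For anyone comparing logs across reports, Xid lines like the ones above can be split into fields with a few lines of Python. This is only a rough sketch: the regular expression is inferred from the log lines posted in this thread, not from any NVIDIA format description, and `parse_xid` is a hypothetical helper name.

```python
import re

# Pattern inferred from the NVRM lines quoted in this thread (an assumption,
# not an official format description).
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), name=(?P<name>[^,]+), (?P<detail>.*)"
)


def parse_xid(line):
    """Return a dict of fields for an NVRM Xid line, or None if no match."""
    m = XID_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Convert the numeric fields so they can be compared and counted.
    fields["xid"] = int(fields["xid"])
    fields["pid"] = int(fields["pid"])
    return fields


sample = (
    "[ 1313.129232] NVRM: Xid (PCI:0000:01:00): 109, pid=3174, "
    "name=WL3.exe, Ch 00000016, errorString CTX SWITCH TIMEOUT, Info 0x1c00d"
)
print(parse_xid(sample))
```

Feeding it the output of `dmesg` or `journalctl -k` line by line and keeping the non-None results should be enough to see which Xid numbers and which processes show up.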

Not sure what to do at this point. I’ve attached nvidia-bug-report output as well.

nvidia-bug-report.log.gz (894.9 KB)

It kind of blows as I was really looking forward to progressing in this game.

Tried Metro Exodus Enhanced again with driver version 535.54.03.
The game starts and runs, but now shows very heavy artifacting. I tried different Proton versions; nothing made a difference.

OS: Manjaro
KERNEL: 6.1.38-1 LTS
CPU: AMD Ryzen 7 2700X
GPU: NVIDIA GeForce RTX 3070
GPU DRIVER: NVIDIA 535.54.03
RAM: 32 GB

nvidia-bug-report.log.gz (678.1 KB)

So after running the Horizon Zero Dawn Benchmark 2 times in a row with max settings and VSync disabled I got a new Xid 31:

kernel: NVRM: GPU at PCI:0000:26:00: GPU-6f98b267-20cc-5347-51dc-8bad07fd2ad0
NVRM: Xid (PCI:0000:26:00): 31, pid=6978, name=HorizonZeroDawn, Ch 00000066, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x4f90_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

I’ve seen the FAULT_PDE ACCESS_TYPE_VIRT_READ fault in other games with an older driver, but this is the first time with the current one (535.54.03).

Getting the same error, just like UrbenLegend.

nvidia-bug-report.log.gz (928.3 KB)

Can you try with this env variable?

__GL_b5f2b3=0xFFFFFFFF

It seems to fix all the artifacting. Around 1 hour of runtime without any issues so far. No signs of Xid 109 or Xid 31 errors. All RTX features work and performance is good. I haven’t encountered the Xid 109 in any other game so far either.

This is the list of launch parameters for the Steam version:

PROTON_HIDE_NVIDIA_GPU=0 PROTON_ENABLE_NVAPI=1 VKD3D_CONFIG=dxr11 VKD3D_FEATURE_LEVEL=12_2 __GL_b5f2b3=0xFFFFFFFF gamemoderun mangohud %command%
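
To make the structure of that line explicit: the leading NAME=value tokens are environment variables, and everything after them is the wrapper chain ending in `%command%`. Here is a small Python sketch that splits such a string the same way. It assumes the usual shell convention for leading assignments, and `split_launch_options` is a hypothetical helper, not anything Steam provides.

```python
import shlex


def split_launch_options(options):
    """Split a launch-options string into (env, command).

    Leading NAME=value tokens are treated as environment variables;
    everything from the first other token onward is the command line.
    """
    env = {}
    tokens = shlex.split(options)
    for i, token in enumerate(tokens):
        name, sep, value = token.partition("=")
        # Accept only shell-style variable names: letters, digits, underscores,
        # not starting with a digit.
        if sep and name.replace("_", "a").isalnum() and not name[0].isdigit():
            env[name] = value
        else:
            return env, tokens[i:]
    return env, []


options = (
    "PROTON_HIDE_NVIDIA_GPU=0 PROTON_ENABLE_NVAPI=1 VKD3D_CONFIG=dxr11 "
    "VKD3D_FEATURE_LEVEL=12_2 __GL_b5f2b3=0xFFFFFFFF "
    "gamemoderun mangohud %command%"
)
env, command = split_launch_options(options)
print(env)
print(command)
```

This makes it easy to test one variable at a time, for example dropping everything except `__GL_b5f2b3` to check whether it alone makes the difference.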

Just for good measure, here is the link to the fix mentioned by @VulkanGuy.

OS: Manjaro
KERNEL: 6.1.38-1 LTS
CPU: AMD Ryzen 7 2700X
GPU: NVIDIA GeForce RTX 3070
GPU DRIVER: NVIDIA 535.54.03
PROTON VERSION: GE-PROTON8-6
RAM: 32 GB

I face the same issue for Horizon Zero Dawn.

I can quite reliably reproduce the error, sometimes within a few seconds of starting the game after loading.

[ 6507.843486] NVRM: Xid (PCI:0000:01:00): 31, pid=48902, name=HorizonZeroDawn, Ch 00000066, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
[ 6746.081041] NVRM: Xid (PCI:0000:01:00): 31, pid=50223, name=HorizonZeroDawn, Ch 00000066, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

Unfortunately, the fix mentioned by VulkanGuy doesn’t work for this case, it seems.

nvidia-bug-report.log.gz (461.8 KB)

If I set the resolution to 4K, DLSS to performance, and all settings to ultra, I can reproduce the issue every time (usually within 10 seconds of loading my save).

Currently, for me, kernel version 6.4.2 and Nvidia driver version 535.54.03 are working to avoid the CTX Switch Timeout errors… However, this combination is causing my screen to flicker quite often.

It may be unrelated, but I do now get an error pop-up when attempting to run Deep Rock Galactic: “Out of video memory trying to allocate a rendering resource. Make sure your video card has the minimum required memory, try lowering the resolution and/or closing other applications that are running. Exiting…”. Restarting the game fixes it.

RTX 2060 Super 8 GB

Hi,
I’ve avidly followed this topic as, for the first time in my life with NVIDIA, I’ve encountered this infamous issue.
NVRM: Xid (PCI:0000:01:00): 31, pid=9037, name=HorizonZeroDawn, Ch 00000046, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

Context: Ubuntu 23.04, Core i5-13600KF, ASUS TUF 3060 12GB, 64GB of DDR5 RAM. Using Steam with Experimental Proton.
Kernel 6.2.0-25-Generic
NVidia Driver Version 525.125.06
NVML Version 12.525.125.06

I’ve tried all the “tricks” described in this thread (and the other related threads too). I’ve even downgraded the NVIDIA driver from 535 to 525.

Perhaps I’ve been too greedy, as I try to play at 1440p with VSync, no FPS limit, Ultra quality, and DLSS.

With the 535 driver, the game crashes during the benchmark. Now, with 525, I can launch the benchmark every time without any problem.
But then, when I try to play, the screen freezes within 2 minutes (the sound keeps going). Eventually I can close the application (otherwise the whole system stays frozen).

Since it seems to be a CUDA issue, I thought that it would be a good idea to leave the GPU alone. So I’ve tried to lower the details, deactivate DLSS and so on. And it works, to a certain extent.
I was able to play for 3 hours in a row one time!
But it was just once. Then the issue came back.
I’ve even tried to roll back to an older kernel (after all, it was working in the good ol’ days, wasn’t it?) but without success.

Perhaps my 3060 got fried on the corners… Dunno. I hope this contributes to a (high-level) diagnosis of the issue.

However this combination is causing my screen to flicker quite often.

The flickering is a known bug. I’m using 535.86.05 and it seems fixed with this one.

The new driver version 535.86.05 was flickering for me until I set Wayland’s adaptive sync option from “automatic” to “always”. Now the flicker seems to have disappeared, even at 165 Hz.

EDIT: Never mind, the flickering is back at 165 Hz/144 Hz.

EDIT: X11 has the same flickering.

EDIT: I also get noticeable coil whine at over 60 Hz when I move the mouse in X11.

I updated to 535.86.05 but still…

[ 5669.058292] NVRM: Xid (PCI:0000:01:00): 31, pid=38802, name=HorizonZeroDawn, Ch 0000003e, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

With 535.86.05 the issue is still present and very easy to reproduce.

This time I only had to run the Horizon benchmark once. After it finished, when it should have returned to the menu, the game froze with:

kernel: NVRM: Xid (PCI:0000:26:00): 31, pid=913, name=HorizonZeroDawn, Ch 00000046, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

nvidia-bug-report.log.gz (736.1 KB)

Same issue, Horizon Zero Dawn, 535.86.05:

[48731.652778] NVRM: Xid (PCI:0000:01:00): 31, pid=92811, name=HorizonZeroDawn, Ch 00000056, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

nvidia-bug-report.log.gz (963.9 KB)

[141142.891254] NVRM: Xid (PCI:0000:01:00): 109, pid=36464, name=Warframe.x64.ex, Ch 000000ce, errorString CTX SWITCH TIMEOUT, Info 0x1c067

[51724.587479] NVRM: Xid (PCI:0000:01:00): 109, pid=53280, name=Warframe.x64.ex, Ch 00000096, errorString CTX SWITCH TIMEOUT, Info 0x2c05e

It happens consistently. Kernel 6.4.7, driver 535.86.05. X11, KDE. Also, my screen still flickers all the time.

nvidia-bug-report.log.gz (1.4 MB)

Kernel: 6.4.7
Driver: 535.86.05
GPU: 4070

Multiple games running on Proton (GE, 7.x, 8.x) work fine for minutes, sometimes hours, but ultimately hang after this error. Example from EVE Online:
kernel: NVRM: Xid (PCI:0000:3e:00): 109, pid=1201042, name=exefile.exe, Ch 000000e6, errorString CTX SWITCH TIMEOUT, Info 0x86c06d

Has anyone found workarounds/fixes or identified what the latest provably functional version combination is? There seems to be uncertainty on this in the thread.
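
To make reports easier to compare, it may help if everyone posts the kernel and driver version in the same form. Below is a rough Python sketch for collecting both. It assumes the driver exposes its version string in `/proc/driver/nvidia/version` and that driver versions look like the ones quoted in this thread (for example 535.86.05); `extract_driver_version` and `version_report` are hypothetical helper names.

```python
import platform
import re
from pathlib import Path

# Assumption: the loaded NVIDIA kernel module reports its version in this file.
NVIDIA_VERSION_FILE = Path("/proc/driver/nvidia/version")


def extract_driver_version(text):
    """Return the first token shaped like 535.54.03, or None if absent."""
    m = re.search(r"\b\d{3}\.\d+(?:\.\d+)?\b", text)
    return m.group(0) if m else None


def version_report():
    """Kernel release plus driver version (None if it cannot be read)."""
    try:
        driver = extract_driver_version(NVIDIA_VERSION_FILE.read_text())
    except OSError:
        # File missing or unreadable, e.g. the module is not loaded.
        driver = None
    return {"kernel": platform.release(), "driver": driver}


print(version_report())
```

Posting that output next to the Xid line would make it easier to see which kernel and driver combinations are actually affected.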

Nvidia driver version 520.56.06 compiled for kernel 6.0.19 should be stable. I am not able to get that driver to work on any later kernels in Arch Linux. Hopefully that works for your 4070.

I wasn’t able to boot with this combination. Thanks anyways.

@amrits are you able to share any information on whether NVIDIA has been able to pinpoint the culprit and find a possible remedy in the driver and, if so, is there a timetable for when we can expect a fix? Thanks.

My new card, and I understand this is true for many others as well, is pretty much useless.

I am using Linux-tkg and nvidia-All packages from TKG here, if it helps. You should be able to pick kernel 6.0.19 and Nvidia 520.56.06 from there and build your stuff.

A kernel and driver that old might just not work for your new card, as this combination is from October of 2022. Something obviously broke in either kernel versions 6.1+ or Nvidia driver version 525+.

Same problem here, driver version 535.86.05 (RTX 3080 Ti)
Arch Linux kernel 6.4.8-arch1-1
Horizon Zero Dawn with the latest Proton Experimental (gamemoderun %command%). I played for about 100 minutes with occasional freezes, but now I can’t continue anymore: the game freezes within the first 1-2 minutes after loading my savegame.
nvidia-bug-report.log.gz (869.8 KB)

[ 795.683492] NVRM: Xid (PCI:0000:2b:00): 31, pid=5972, name=HorizonZeroDawn, Ch 0000003a, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
[ 795.695973] NVRM: Xid (PCI:0000:2b:00): 31, pid=5972, name=HorizonZeroDawn, Ch 0000003a, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x1a_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
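
The two faults above differ in client and address, which is easy to miss by eye. Here is a rough Python sketch for pulling the MMU fault details out of Xid 31 lines. The pattern is inferred from the lines posted in this thread, and treating the part before the underscore as the upper 32 bits of the address is my assumption, not something confirmed here; `parse_mmu_fault` is a hypothetical helper name.

```python
import re

# Pattern inferred from the Xid 31 lines in this thread (an assumption).
MMU_RE = re.compile(
    r"MMU Fault: ENGINE (?P<engine>\S+) (?P<client>\S+) faulted @ "
    r"0x(?P<hi>[0-9a-fA-F]+)_(?P<lo>[0-9a-fA-F]{8})\. "
    r"Fault is of type (?P<fault>\S+) (?P<access>\S+)"
)


def parse_mmu_fault(line):
    """Return engine, client, address, fault and access type, or None."""
    m = MMU_RE.search(line)
    if m is None:
        return None
    # Assumption: "hi_lo" is the upper and lower 32 bits of one address.
    address = (int(m.group("hi"), 16) << 32) | int(m.group("lo"), 16)
    return {
        "engine": m.group("engine"),
        "client": m.group("client"),
        "address": address,
        "fault": m.group("fault"),
        "access": m.group("access"),
    }


line = (
    "NVRM: Xid (PCI:0000:2b:00): 31, pid=5972, name=HorizonZeroDawn, "
    "Ch 0000003a, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 "
    "faulted @ 0x1a_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ"
)
print(parse_mmu_fault(line))
```

Grouping the results by client and address across reports would show whether the GPCCLIENT_GCC fault at address zero is the common case and the others are follow-up faults.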