Multiple CUDA/RTX/Vulkan applications crashing with Xid (13,109) errors

I’m trying to run it with Proton. Whether it’s Proton Experimental or GE-Proton doesn’t matter: even after setting VKD3D_CONFIG=dxr11, it pops up the dialog, and with anything based on Wine 8.x I can’t interact with the dialog. With GE-Proton 7.55, I can click the button, but then the game crashes; it’s an older VKD3D, so I guess I’m not surprised.

Yeah, I don’t think this issue is fixed yet either. Testing so far has not exactly yielded positive results.

Those for whom it does not work are probably best advised to attach their nvidia-bug-report.
It might be that you guys are running into Xid 31 or 109 but the cause is different.

Some examples are:

What was solved for me is CTX SWITCH TIMEOUT. See if this is also the case for you.

While it’s nice that this has been fixed for you, the issue is still far from solved. I have been seeing this issue for over eight months. It has been better or worse as drivers get updated, but it has been a problem for a long while now.

The user catfishk was right: something significant broke, and it has made gaming on Nvidia impossible. Nvidia’s response has also been lackluster.

With 535, CTX SWITCH TIMEOUT occurs immediately on the menu screen of SF6. It also occurs when loading up a new game of Metro Exodus EE, and a short time after loading up The Last of Us.

NVRM: Xid (PCI:0000:0c:00): 109, pid=6146, name=StreetFighter6., Ch 00000062, errorString CTX SWITCH TIMEOUT, Info 0x1c060

Unless the game is older or less demanding, this error, or a similar catastrophic driver event, occurs after a short amount of time.
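
For anyone who wants to compare their own logs against the Xid line quoted above, here is a minimal Python sketch that pulls the fields out of such a line. The pattern is only inferred from the lines posted in this thread, not from any official format description, and `parse_xid` is a made-up helper name.

```python
import re

# Pattern inferred from the NVRM lines quoted in this thread (an assumption,
# not an official format specification).
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), name=(?P<name>[^,]+), (?P<details>.*)"
)


def parse_xid(line):
    """Return the fields of an NVRM Xid line as a dict, or None if it does not match."""
    match = XID_RE.search(line)
    if match is None:
        return None
    return {
        "pci": match["pci"],
        "xid": int(match["xid"]),
        "pid": int(match["pid"]),
        "name": match["name"],        # process name as logged (it looks truncated)
        "details": match["details"],  # everything after the name, left unparsed
    }


sample = (
    "NVRM: Xid (PCI:0000:0c:00): 109, pid=6146, name=StreetFighter6., "
    "Ch 00000062, errorString CTX SWITCH TIMEOUT, Info 0x1c060"
)
info = parse_xid(sample)
print(info["xid"], info["name"], info["details"])
```

The `details` part is kept as one string on purpose, since it differs between Xid codes (an error string for 109, an MMU fault description for 31).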

I’ve attached a bug report, but in all honesty I’m not sure there is a point. I can’t vouch for other users but personally I feel defeated.

nvidia-bug-report.log.gz (859.9 KB)

I know it has persisted for a long, long time now (I started this thread).

Since you mentioned Metro Exodus EE: it is super weird, as with the latest driver update it was solved for me, see here.

Just re-validated this upon writing this comment.
Everything is maxed out, and I am also encoding a video on the GPU at the same time rn.
This seems to be a very taxing scenario: running an RTX title with all RTX features turned on while also encoding a video on the same GPU that renders the game.

Which makes me believe that this issue is not very easy to track down if we get such different results running the same games.
I mean the two of us even run the 6.3.9 Kernel :(

Still need to check out SF6 and The Last Of Us tbh.

If you start the game via Steam, start it with the following launch parameters:
PROTON_HIDE_NVIDIA_GPU=0 PROTON_ENABLE_NVAPI=1 VKD3D_CONFIG=dxr11 VKD3D_FEATURE_LEVEL=12_2 gamemoderun mangohud %command%
That takes care of the message for me. Sadly, it still doesn’t fix the Xid 109 error.
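
If you juggle several sets of launch parameters, a small Python sketch like the following can assemble the string so that the variables always end up in front of the wrappers and `%command%`. The variable names and values are copied from the post above; `build_launch_options` is a hypothetical helper, not part of Steam or Proton.

```python
def build_launch_options(env, wrappers):
    """Join KEY=VALUE assignments, wrapper commands, and Steam's %command% placeholder."""
    assignments = [f"{key}={value}" for key, value in env.items()]
    return " ".join(assignments + list(wrappers) + ["%command%"])


# Values taken from the launch parameters quoted above.
env = {
    "PROTON_HIDE_NVIDIA_GPU": "0",
    "PROTON_ENABLE_NVAPI": "1",
    "VKD3D_CONFIG": "dxr11",
    "VKD3D_FEATURE_LEVEL": "12_2",
}
options = build_launch_options(env, ["gamemoderun", "mangohud"])
print(options)
```

The printed string is what goes into the game's launch options field in Steam.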

With 535, both Last of Us Part 1 and Metro Exodus Enhanced Edition seem to be way more stable. I haven’t had a crash in those two games yet. However, like @alxander.lohr, I am still getting crashes in Horizon Zero Dawn.

kernel: NVRM: Xid (PCI:0000:0c:00): 31, pid=19430, name=HorizonZeroDawn, Ch 00000097, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

Happened again while playing Wasteland 3. I’m on 535.54.03 now as well.

[ 1313.129229] NVRM: GPU at PCI:0000:01:00: GPU-d94a8dd3-7c3d-ecc0-033e-1ec177ee9969
[ 1313.129232] NVRM: Xid (PCI:0000:01:00): 109, pid=3174, name=WL3.exe, Ch 00000016, errorString CTX SWITCH TIMEOUT, Info 0x1c00d

Not sure what to do at this point. I’ve attached nvidia-bug-report output as well.

nvidia-bug-report.log.gz (894.9 KB)

It kind of blows as I was really looking forward to progressing in this game.

Tried Metro Exodus Enhanced again with driver version 535.54.03.
Game starts and runs, but now gets very heavy artifacting. Tried different Proton versions; nothing made a difference.

OS: Manjaro
KERNEL: 6.1.38-1 LTS
CPU: AMD Ryzen 7 2700X
GPU: NVIDIA GeForce RTX 3070
GPU DRIVER: NVIDIA 535.54.03
RAM: 32 GB

nvidia-bug-report.log.gz (678.1 KB)

So after running the Horizon Zero Dawn Benchmark 2 times in a row with max settings and VSync disabled I got a new Xid 31:

kernel: NVRM: GPU at PCI:0000:26:00: GPU-6f98b267-20cc-5347-51dc-8bad07fd2ad0
NVRM: Xid (PCI:0000:26:00): 31, pid=6978, name=HorizonZeroDawn, Ch 00000066, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x4f90_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

I’ve seen the FAULT_PDE ACCESS_TYPE_VIRT_READ fault in other games with an older driver, but this is the first time with the current one (535.54.03).
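
To compare MMU fault details across reports, a rough Python sketch along these lines can split the Xid 31 message into its parts. It assumes that the `0x4f90_00000000` notation is the upper and lower 32 bits of one address separated by an underscore; that reading, the pattern, and the helper name `parse_mmu_fault` are my assumptions, based only on the lines posted here.

```python
import re

# Pattern inferred from the Xid 31 lines in this thread (assumption).
MMU_RE = re.compile(
    r"MMU Fault: ENGINE (?P<engine>\S+) (?P<client>\S+) faulted @ "
    r"0x(?P<hi>[0-9a-fA-F]+)_(?P<lo>[0-9a-fA-F]{8})\. "
    r"Fault is of type (?P<fault>\S+) (?P<access>\S+)"
)


def parse_mmu_fault(line):
    """Return engine, client, fault address, and fault/access type, or None."""
    match = MMU_RE.search(line)
    if match is None:
        return None
    # Assumption: the part before the underscore is the upper 32 bits.
    address = (int(match["hi"], 16) << 32) | int(match["lo"], 16)
    return {
        "engine": match["engine"],
        "client": match["client"],
        "address": address,
        "fault": match["fault"],
        "access": match["access"],
    }


sample = (
    "NVRM: Xid (PCI:0000:26:00): 31, pid=6978, name=HorizonZeroDawn, "
    "Ch 00000066, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC "
    "faulted @ 0x4f90_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ"
)
fault = parse_mmu_fault(sample)
print(fault["client"], hex(fault["address"]), fault["fault"], fault["access"])
```

With this, the reports in this thread that fault at `0x0_00000000` come out with an address of 0, which makes them easy to tell apart from the non-zero one above.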

Getting the same error as UrbenLegend.

nvidia-bug-report.log.gz (928.3 KB)

Can you try with this env variable?

__GL_b5f2b3=0xFFFFFFFF

It seems to fix all the artifacting. Around 1 hour of runtime without any issues so far. No signs of Xid 109 or Xid 31 errors. All RTX features work and performance is good. I haven’t encountered the Xid 109 in any other game so far either.

This is the list of launch parameters for the Steam version:

PROTON_HIDE_NVIDIA_GPU=0 PROTON_ENABLE_NVAPI=1 VKD3D_CONFIG=dxr11 VKD3D_FEATURE_LEVEL=12_2 __GL_b5f2b3=0xFFFFFFFF gamemoderun mangohud %command%
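
To see exactly what changed between two sets of launch parameters, here is a small Python sketch that splits each string into its leading `KEY=VALUE` assignments and the remaining command, then reports the variables that were added or changed. Both strings are the ones posted in this thread; `split_launch_options` is a made-up helper name.

```python
import shlex


def split_launch_options(options):
    """Split a launch string into (env dict, command token list)."""
    env, command = {}, []
    for token in shlex.split(options):
        # Only leading KEY=VALUE tokens count as environment assignments;
        # once the command has started, everything belongs to the command.
        if not command and "=" in token and not token.startswith("%"):
            key, value = token.split("=", 1)
            env[key] = value
        else:
            command.append(token)
    return env, command


old = ("PROTON_HIDE_NVIDIA_GPU=0 PROTON_ENABLE_NVAPI=1 VKD3D_CONFIG=dxr11 "
       "VKD3D_FEATURE_LEVEL=12_2 gamemoderun mangohud %command%")
new = ("PROTON_HIDE_NVIDIA_GPU=0 PROTON_ENABLE_NVAPI=1 VKD3D_CONFIG=dxr11 "
       "VKD3D_FEATURE_LEVEL=12_2 __GL_b5f2b3=0xFFFFFFFF gamemoderun mangohud %command%")

old_env, old_cmd = split_launch_options(old)
new_env, new_cmd = split_launch_options(new)
changed = {k: v for k, v in new_env.items() if old_env.get(k) != v}
print(changed)
```

For the two strings above, the only difference is the `__GL_b5f2b3` variable; the wrappers and `%command%` are unchanged.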

Just for good measure, here is the link to the fix mentioned by @VulkanGuy.

OS: Manjaro
KERNEL: 6.1.38-1 LTS
CPU: AMD Ryzen 7 2700X
GPU: NVIDIA GeForce RTX 3070
GPU DRIVER: NVIDIA 535.54.03
PROTON VERSION: GE-PROTON8-6
RAM: 32 GB

I face the same issue for Horizon Zero Dawn.

I can quite reliably reproduce the error, sometimes within a few seconds of starting the game after loading.

[ 6507.843486] NVRM: Xid (PCI:0000:01:00): 31, pid=48902, name=HorizonZeroDawn, Ch 00000066, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
[ 6746.081041] NVRM: Xid (PCI:0000:01:00): 31, pid=50223, name=HorizonZeroDawn, Ch 00000066, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
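
To put a number on how far apart such faults are, a short Python sketch can read the dmesg timestamps in square brackets and compute the gap between consecutive Xid lines. It assumes the bracketed value is seconds since boot, as dmesg prints by default; `xid_intervals` is a hypothetical helper name. Note that the two lines above have different pids, so the gap covers a restart of the game, not time spent in one session.

```python
import re

# Leading "[ seconds.micros]" timestamp as printed by dmesg (assumed default format).
TS_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]")


def xid_intervals(lines):
    """Return the seconds between consecutive timestamped NVRM Xid lines."""
    stamps = []
    for line in lines:
        match = TS_RE.match(line)
        if match and "NVRM: Xid" in line:
            stamps.append(float(match["ts"]))
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


TAIL = ("Ch 00000066, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC "
        "faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ")
log = [
    "[ 6507.843486] NVRM: Xid (PCI:0000:01:00): 31, pid=48902, name=HorizonZeroDawn, " + TAIL,
    "[ 6746.081041] NVRM: Xid (PCI:0000:01:00): 31, pid=50223, name=HorizonZeroDawn, " + TAIL,
]
intervals = xid_intervals(log)
print([round(gap, 1) for gap in intervals])
```

For the two lines above this gives a gap of roughly four minutes between the crashes.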

Unfortunately, the fix mentioned by VulkanGuy doesn’t work for this case, it seems.

nvidia-bug-report.log.gz (461.8 KB)

If I set the resolution to 4K, DLSS performance, and set all settings to ultra, I can reproduce the issue every time (usually within 10 seconds of loading my save)

Currently for me, kernel version 6.4.2 and Nvidia driver version 535.54.03 are working to avoid the CTX Switch Timeout errors… However this combination is causing my screen to flicker quite often.

It may be unrelated, but I do now get an error pop-up when attempting to run Deep Rock Galactic: “Out of video memory trying to allocate a rendering resource. Make sure your video card has the minimum required memory, try lowering the resolution and/or closing other applications that are running. Exiting…”. Restarting the game fixes it.

RTX 2060 Super 8 GB

Hi,
I’ve avidly followed this topic as, for the first time in my life with NVidia, I’ve encountered this infamous issue.
NVRM: Xid (PCI:0000:01:00): 31, pid=9037, name=HorizonZeroDawn, Ch 00000046, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

Context : Ubuntu 23.04, Core i5-13600KF, ASUS TUF 3060 12GB, 64GB of DDR5 RAM. Using Steam with Experimental Proton.
Kernel 6.2.0-25-Generic
NVidia Driver Version 525.125.06
NVML Version 12.525.125.06

I’ve tried all the “tricks” described in this thread (and the other related threads too). I’ve even downgraded the NVidia driver from 535 to 525.

Perhaps I’ve been too greedy as I try to play at 1440p, VSYNC, no FPS limitations, Ultra quality, DLSS.

With the 535 driver, the game crashes during the benchmark. Now, with the 525, I can launch the benchmark every time without having any problem.
But then, I try to play, and within 2 minutes, the screen freezes (not the sound). Eventually, I can close the application (otherwise, the whole system is frozen).

Since it seems to be a CUDA issue, I thought that it would be a good idea to leave the GPU alone. So I’ve tried to lower the details, deactivate DLSS and so on. And it works, to a certain extent.
I was able to play for 3h in a row one time!
But it was just once. Then, the issue came back.
I’ve even tried to roll back to an older kernel (after all, it was working in the good ol’ times, wasn’t it?) but without success.

Perhaps my 3060 got fried on the corners… Dunno. I hope I’ll contribute to (high level) diagnosing the issue.

However this combination is causing my screen to flicker quite often.

The flickering is a known bug, I’m using 535.86.05 and seems fixed with this one.

The new driver version 535.86.05 was flickering for me, until I set Wayland’s adaptive sync option from “automatic” to “always”. Now the flicker seems to have disappeared, even at 165 Hz.

EDIT: Never mind, the flickering is back at 165 Hz/144 Hz.

EDIT: X11 has the same flickering.

EDIT: I also get noticeable coil whine at over 60 Hz when I move the mouse in X11.

I updated to 535.86.05 but still …

[ 5669.058292] NVRM: Xid (PCI:0000:01:00): 31, pid=38802, name=HorizonZeroDawn, Ch 0000003e, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

535.86.05: issue still present and very easy to reproduce.

This time I only had to run the Horizon benchmark once; after it finished, when it should have returned to the menu, the game froze with:

kernel: NVRM: Xid (PCI:0000:26:00): 31, pid=913, name=HorizonZeroDawn, Ch 00000046, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

nvidia-bug-report.log.gz (736.1 KB)

Same issue, Horizon Zero Dawn, 535.86.05:

[48731.652778] NVRM: Xid (PCI:0000:01:00): 31, pid=92811, name=HorizonZeroDawn, Ch 00000056, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_GCC faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ

nvidia-bug-report.log.gz (963.9 KB)