Multiple CUDA/RTX/Vulkan applications crashing with Xid (13, 109) errors

Same Xid 109 here, for months now!!!
-Horizon Zero Dawn
-Control, with DX11 and DX12, at all graphics settings
I don’t remember so many freezes when I started playing Control in September 2023…
I tried with drivers 535 and 550.
nvidia-bug-report.log.gz (860.6 KB)

nvidia-bug-report.log.gz (2.0 MB)

[ 2040.320708] NVRM: Xid (PCI:0000:07:00): 109, pid='<unknown>', name=<unknown>, Ch 00000044, errorString CTX SWITCH TIMEOUT, Info 0x22c054
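
For anyone comparing reports in this thread, here is a minimal sketch of pulling the fields out of an `NVRM: Xid` line like the one above. The regular expression is only an assumption based on the lines quoted in this thread, not an official log format, and `parse_xid` is a hypothetical helper name.

```python
import re

# Pattern inferred from the "NVRM: Xid (...)" lines quoted in this thread.
# It is an assumption, not a documented format.
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>[^,]+), name=(?P<name>[^,]+), (?P<detail>.*)"
)


def parse_xid(line):
    """Return a dict of fields for an Xid line, or None if the line does not match."""
    m = XID_RE.search(line)
    if m is None:
        return None
    return {
        "pci": m["pci"],
        "xid": int(m["xid"]),
        "pid": m["pid"].strip("'"),  # pid is sometimes quoted as '<unknown>'
        "name": m["name"],
        "detail": m["detail"].strip(),  # e.g. channel, errorString, Info
    }
```

Feeding it the output of `dmesg` or `journalctl -k` line by line and keeping the non-`None` results gives a quick list of which Xid numbers and processes were involved.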

Cyberpunk 2077 on the current 565 driver, easily reproducible: sometimes the game runs for a while (~30 mins), and sometimes it crashes after one or two minutes.

OS: Garuda Linux x86_64
Kernel: 6.11.6-zen1-1-zen
Resolution: 1920x1080 5120x1440
DE: KDE Plasma 6.2.2
WM: KWin (Wayland)
CPU: Intel(R) Core(TM) i7-8750H (12) @ 4.10 GHz
GPU: NVIDIA GeForce GTX 1050 Ti Mobile
GPU: NVIDIA GeForce RTX 3080 Lite Hash Rate
GPU: Intel UHD Graphics 630 @ 1.10 GHz [Integrated]

wow, and now even on Kentucky Route Zero
bugfix please!!!

Confirming that a swap to nvidia-open-dkms 565.57.01-2 from nvidia-dkms made the xid 109 errors disappear in Cyberpunk 2077 with no other changes.

Installed ACValhalla via Ubisoft Connect and played multiple times, but did not observe any crash or Xid 109 error on a couple of test systems:
ASRock TRX40 Taichi + Ubuntu 24.04 LTS + kernel 6.8.0-48-generic + NVIDIA GeForce RTX 4080 + Driver 560.35.03
Dell T610 + Ubuntu 24.04 LTS + kernel 6.8.0-48-generic + NVIDIA GeForce RTX 2080 + Driver 560.35.03

It would be helpful to know the repro frequency and any particular map or location where the issue hits frequently.

Thanks for checking.
I can only share my recent experience with Valhalla: with 560 under Wayland it seems to work fine, and I played quite a lot without any crashes.

Perhaps @Lifeismana or @tellusEidolon can provide more information.

I was running it from Steam; I’m not sure if there’s any difference from running it via Ubisoft Connect (Steam’s overlay hooking itself in, and the Proton version not being the same? Proton Experimental for me). Maybe adding it as a non-Steam game could help repro the issue.

Sadly I’m unable to roll back to 560, and I can’t test it on 565 because of the bad artifacting (which is fixed in the next release, afaik).

From my past testing, this bug really seems to be caused by CPU processing taking too much time: with a 7800X3D I had 1 or 2 hard freezes while in game (these freezes didn’t throw any error in dmesg), but once I switched to a 7600X I had a LOT more freezes, almost every time I was in a crowded place or loading one.

Maybe limiting the number of cores of the game’s processes (to 4 or 6) could help reproduce that
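
One way to try the core-limiting idea is a small launcher that pins itself to a few CPUs and then starts the game, so every child process inherits the restriction. This is only a sketch for Linux: `pick_cpus` and `run_limited` are hypothetical names, and picking the lowest-numbered CPUs is an arbitrary choice rather than anything tested in this thread.

```python
import os


def pick_cpus(available, count):
    """Pick the `count` lowest-numbered CPUs out of the allowed set."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return set(sorted(available)[:count])


def run_limited(count, command):
    """Restrict this process to `count` CPUs, then replace it with `command`.

    The affinity mask is inherited across exec and by child processes,
    so the game and its helpers should stay on the chosen cores.
    """
    cpus = pick_cpus(os.sched_getaffinity(0), count)
    os.sched_setaffinity(0, cpus)     # Linux-only call
    os.execvp(command[0], command)    # does not return on success
```

Calling `run_limited(4, ["some-game-launcher"])` (a placeholder command) would start that command limited to four cores. The standard `taskset -c 0-3 <command>` tool does the same thing without any Python.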

(fwiw, the error I always get, which is similar to this one but isn’t this one, is the game freezing on the first load unless I alt-tab; this happens every time I start the game, even after the first scene)

As mentioned, I can’t try to reproduce that bug until the next driver release, but my setup was something like:
MSI X670E + Arch Linux + kernel 6.11 (zen version) + NVIDIA GeForce RTX 2080 + Driver 560.35.03

It happens again on 565 with Baldur’s Gate 3 (both vulkan and dx11) and probably all the other proton-run games I’ve had issues with.
OS: EndeavourOS Linux x86_64
Board: B550I AORUS PRO AX
Kernel: 6.11.7-arch1-1
hwinfo.txt (4.7 KB)
nvidia-bug-report.log.gz (590.0 KB)

In Horizon ZD: it seems it could happen at any moment, but especially when in scan vision.
Tried with 560, freeze also happens

@amrits I finally hit an Xid when playing Valhalla, but it’s very rare; I had played for ~2 hours.
nvidia-bug-report.log.gz (1.8 MB)

@amrits
Another one: the music continued to play, but the game screen itself froze.
nvidia-bug-report.log.gz (1.6 MB)

Update: Seems to happen a lot in Werham village, dunno if it’s related to the quest or if it’s that area in general.
Also, forgot to mention, I don’t have an FPS cap set in game, I cap it with mangohud @138 FPS

Hardware: RTX 3080 10GB, 5800X3D
Software: EndeavourOS / nvidia-open-dkms 565.77

Getting this Xid 109 CTX SWITCH TIMEOUT error with WoW randomly. Running through Lutris with Proton Experimental as the runner. I am able to Ctrl-Alt-F2 to a TTY and kill the process at least, but the error comes back quickly if I don’t reboot before launching the game again.

It first happened under X11 + KDE, so I tried Wayland + KDE, no dice.
I tried X11 + XFCE4 next, it didn’t crash for a while, until it did. So we have eliminated DE and X11/Wayland so far.
I also tried an older kernel, 6.6 LTS. No difference, crashes. nvidia-open-dkms or not, doesn’t matter.

May be a regression, I didn’t have the issue until the 565.77 driver.

This bug being ongoing for 2 years is crazy.

Hello again, folks!
I love y’all, don’t get me wrong, but I’m just really not excited to be joining you on this thread again :p

I’m once again having a problem with the exact same symptoms in a different game, except nothing interesting showing up on dmesg.

Driver 565.77 here, with Shadow of The Tomb Raider. I updated my system today, so I’m on that driver and kernel 6.12.4, but the problem also happened on LTS kernel (I was there since a previous nvidia driver version was broken on 6.12).

Oddly enough, I was able to get quite advanced on the game before this happening for the first time and now it just keeps happening (maybe I updated the drivers? Idk honestly but I don’t remember doing so. My flake.lock says dec 6 was the last commit on one of its inputs (the other ones having older commits) and I started Shadow on dec 7, but idk exactly when I ran that update since I only made a commit with that today. The problem did not happen on Rise or in tomb raider (2013) tho, which I played before running that update.).
The game freezes, but the game’s audio keeps working.

Very similar to the issue that was affecting Witcher 3 like a year ago, except nothing shows up on dmesg this time.

I’m launching Shadow of The Tomb Raider via Steam, as a non-steam game added by heroic games (ugh, I got it for free on epic but otherwise I could probably be happily playing it rn on the native version if I had bought it from Steam).

Potentially related to this, maybe: 560 release feedback & discussion - #329 by fdelente

Dmesg shows nothing interesting, running Steam from a terminal also shows nothing, and Tomb Raider’s logs don’t seem to say anything interesting either. Still, adding all of those logs here.

dmesg-tombraider-freeze.txt (132.4 KB)

Shadow of the Tomb Raider.log (30.5 KB)

steam-logs-tombraider-freeze.tar.gz (8.6 MB)

And, as @amrits requested last time I commented here, nvidia-bug-report.log.gz (411.2 KB)

I changed distros since last time I had a similar issue (with Cyberpunk and Witcher), and now since I use NixOS (as opposed to Tumbleweed) good news is you can probably get a system exactly like mine on the same exact versions of software and drivers I’m on (due to the flake.lock!) if that’d help you reproduce! (host is Luana-X670E) (On a NixOS system, cloning that repo, checking out the commit you want and doing sudo nixos-rebuild --flake .#Luana-X670E boot --impure followed by a reboot should put you into a system identical to mine! (Depending on your CPU threads:RAM proportion, you might need to add -j 4 --cores 8 to that command if Nix is compiling too much stuff and eating your whole RAM. If you don’t like Nix compiling all of that instead of grabbing from cache, well I don’t like it either but I use my system with CUDA support and packages compiled with CUDA support don’t get to the cache. That’s on NVIDIA for making CUDA proprietary so you’ll just need to drink a bit of your own poison if you wanna go that path when trying to reproduce (sorry, just couldn’t lose the opportunity to say this lmao xD. I know that you specifically, person trying to reproduce this, have nothing to do with NVIDIA’s shitty corporate decisions regarding not open sourcing CUDA). Should only take long on the 1st time tho, Nix will reuse the local cache whenever possible.)):

Since this problem didn’t happen until reaching this part in the game, I’m assuming that the area might have something to do with it. Due to that, here’s my save file which might help in attempts to reproduce it:
ShadowSave.tar.gz (822.1 KB) (y’all would have saved around 200kb of space on this if you allowed me to upload a 7z btw :p). The freeze happens when playing for a while in that area.

I’ll rollback my system to my previous update before that dec 6(?) one (dec 3, it seems) to see if the problem still happens there (edit: yes it does, so not really related to latest driver probs). Anyway, I believe you should have enough info to reproduce this now. If you need any extra info, please let me know!!

(yes, I use parentheses (and parentheses inside of parentheses) way too much. Sorry if that makes my writing hard to understand)

I seem to be able to reproduce that the freeze will ALWAYS happen if you go here, chill a bit close to the fire, and then move the other way around. The freeze will always happen here for me when I start moving away, before the camera can even rotate fully

Do note that the freeze has happened other times on other places on this area too, more randomly. Falling down the pit, pausing and reloading a checkpoint before that awful animation finishes playing and then just walking around a bit seems to be another quite reliable way of triggering this, tho the chilling around the fire method seems to be a more reliable reproducer.

Not sure if this matters for reproducibility, but I’m using a dualsense controller, launching the game from bigpicture as a non-steam game added by heroic games, using proton experimental i think.

Edit: this can also be reproduced on 560.35.03 and kernel 6.11.8, so I didn’t get this issue before either out of pure luck or because it’s related to the game area and not to a driver version.

Edit 2: yeah, it has to do with this area. I was able to fast travel out of this area and then kept doing the main quest instead, and at least for now the freeze didn’t happen again.

Elden Ring plays normally (launched from Steam) until I get to Limgrave, and then the screen always freezes (sometimes immediately and sometimes after a minute or two); the music continues and I have to force quit. Dmesg:
[ 125.734356] NVRM: GPU at PCI:0000:0b:00: GPU-084af9c9-4830-49dc-03da-f0c329a2220b
[ 125.734362] NVRM: Xid (PCI:0000:0b:00): 109, pid=2725, name=eldenring.exe, Ch 00000022, errorString CTX SWITCH TIMEOUT, Info 0x8c021

My system is as follows:

Alienware Aurora R10 w/ AMD Ryzen 7 5800, NVIDIA RTX 3070 (OEM Asus)
Dist kernel 6.6.62
nvidia-drivers 550.135; it also happens with the new 565.77.

I’ve just bought Elden Ring: it crashes too! Now almost all the games…

2024-12-24T00:25:24.014357+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 0, SM 0): Out Of Range Register
2024-12-24T00:25:24.014358+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 0, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.014358+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x504730=0x2000d 0x504734=0x4 0x504728=0x7c12b72 0x50472c=0x104
2024-12-24T00:25:24.014359+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 0, SM 1): Out Of Range Register
2024-12-24T00:25:24.014360+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 0, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.014360+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x5047b0=0xd 0x5047b4=0x4 0x5047a8=0x7c12b72 0x5047ac=0x104
2024-12-24T00:25:24.014362+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 1, SM 0): Out Of Range Register
2024-12-24T00:25:24.014362+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 1, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.014362+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x504f30=0x1000d 0x504f34=0x4 0x504f28=0x7c12b72 0x504f2c=0x104
2024-12-24T00:25:24.015359+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 1, SM 1): Out Of Range Register
2024-12-24T00:25:24.015361+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 1, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.015361+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x504fb0=0x2000d 0x504fb4=0x4 0x504fa8=0x7c12b72 0x504fac=0x104
2024-12-24T00:25:24.015363+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 2, SM 0): Out Of Range Register
2024-12-24T00:25:24.015363+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 2, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.015363+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x505730=0x1000d 0x505734=0x4 0x505728=0x7c12b72 0x50572c=0x104
2024-12-24T00:25:24.015364+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 2, SM 1): Out Of Range Register
2024-12-24T00:25:24.015365+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 2, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.015365+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x5057b0=0xd 0x5057b4=0x4 0x5057a8=0x7c12b72 0x5057ac=0x104
2024-12-24T00:25:24.016330+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 3, SM 0): Out Of Range Register
2024-12-24T00:25:24.016331+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 3, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.016332+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x505f30=0x2000d 0x505f34=0x4 0x505f28=0x7c12b72 0x505f2c=0x104
2024-12-24T00:25:24.016333+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 3, SM 1): Out Of Range Register
2024-12-24T00:25:24.016333+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 3, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.016333+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x505fb0=0x2000d 0x505fb4=0x4 0x505fa8=0x7c12b72 0x505fac=0x104
2024-12-24T00:25:24.017333+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 4, SM 0): Out Of Range Register
2024-12-24T00:25:24.017335+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 4, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.017336+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x506730=0xd 0x506734=0x4 0x506728=0x7c12b72 0x50672c=0x104
2024-12-24T00:25:24.017336+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 0, TPC 4, SM 1): Out Of Range Register
2024-12-24T00:25:24.017337+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 0, TPC 4, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.017337+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x5067b0=0xd 0x5067b4=0x4 0x5067a8=0x7c12b72 0x5067ac=0x104
2024-12-24T00:25:24.017337+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 0, SM 0): Out Of Range Register
2024-12-24T00:25:24.017338+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 0, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.017338+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50c730=0xd 0x50c734=0x4 0x50c728=0x7c12b72 0x50c72c=0x104
2024-12-24T00:25:24.018331+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 0, SM 1): Out Of Range Register
2024-12-24T00:25:24.018332+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 0, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.018333+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50c7b0=0x2000d 0x50c7b4=0x4 0x50c7a8=0x7c12b72 0x50c7ac=0x104
2024-12-24T00:25:24.018333+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 1, SM 0): Out Of Range Register
2024-12-24T00:25:24.018334+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 1, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.018334+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50cf30=0xd 0x50cf34=0x4 0x50cf28=0x7c12b72 0x50cf2c=0x104
2024-12-24T00:25:24.018335+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 1, SM 1): Out Of Range Register
2024-12-24T00:25:24.018335+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 1, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.018335+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50cfb0=0x3000d 0x50cfb4=0x4 0x50cfa8=0x7c12b72 0x50cfac=0x104
2024-12-24T00:25:24.019334+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 2, SM 0): Out Of Range Register
2024-12-24T00:25:24.019336+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 2, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.019337+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50d730=0xd 0x50d734=0x4 0x50d728=0x7c12b72 0x50d72c=0x104
2024-12-24T00:25:24.019338+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 2, SM 1): Out Of Range Register
2024-12-24T00:25:24.019338+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 2, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.019338+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50d7b0=0x3000d 0x50d7b4=0x4 0x50d7a8=0x7c12b72 0x50d7ac=0x104
2024-12-24T00:25:24.019339+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 3, SM 0): Out Of Range Register
2024-12-24T00:25:24.019339+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 3, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.020338+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50df30=0x2000d 0x50df34=0x4 0x50df28=0x7c12b72 0x50df2c=0x104
2024-12-24T00:25:24.020340+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 3, SM 1): Out Of Range Register
2024-12-24T00:25:24.020340+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 3, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.020341+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50dfb0=0x1000d 0x50dfb4=0x4 0x50dfa8=0x7c12b72 0x50dfac=0x104
2024-12-24T00:25:24.020342+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 4, SM 0): Out Of Range Register
2024-12-24T00:25:24.020343+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 4, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.020343+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50e730=0xd 0x50e734=0x4 0x50e728=0x7c12b72 0x50e72c=0x104
2024-12-24T00:25:24.021336+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 1, TPC 4, SM 1): Out Of Range Register
2024-12-24T00:25:24.021338+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 1, TPC 4, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.021338+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x50e7b0=0xd 0x50e7b4=0x4 0x50e7a8=0x7c12b72 0x50e7ac=0x104
2024-12-24T00:25:24.021339+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 0, SM 0): Out Of Range Register
2024-12-24T00:25:24.021339+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 0, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.021340+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x514730=0xd 0x514734=0x4 0x514728=0x7c12b72 0x51472c=0x104
2024-12-24T00:25:24.021340+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 0, SM 1): Out Of Range Register
2024-12-24T00:25:24.021347+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 0, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.021347+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x5147b0=0x1000d 0x5147b4=0x4 0x5147a8=0x7c12b72 0x5147ac=0x104
2024-12-24T00:25:24.022337+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 1, SM 0): Out Of Range Register
2024-12-24T00:25:24.022341+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 1, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.022341+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x514f30=0x1000d 0x514f34=0x4 0x514f28=0x7c12b72 0x514f2c=0x104
2024-12-24T00:25:24.022342+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 1, SM 1): Out Of Range Register
2024-12-24T00:25:24.022342+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 1, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.022342+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x514fb0=0x2000d 0x514fb4=0x4 0x514fa8=0x7c12b72 0x514fac=0x104
2024-12-24T00:25:24.023336+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 2, SM 0): Out Of Range Register
2024-12-24T00:25:24.023339+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 2, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.023339+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x515730=0xd 0x515734=0x4 0x515728=0x7c12b72 0x51572c=0x104
2024-12-24T00:25:24.023340+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 2, SM 1): Out Of Range Register
2024-12-24T00:25:24.023340+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 2, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.023341+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x5157b0=0x3000d 0x5157b4=0x4 0x5157a8=0x7c12b72 0x5157ac=0x104
2024-12-24T00:25:24.023341+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 3, SM 0): Out Of Range Register
2024-12-24T00:25:24.023341+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 3, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.023342+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x515f30=0xd 0x515f34=0x4 0x515f28=0x7c12b72 0x515f2c=0x104
2024-12-24T00:25:24.024337+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 3, SM 1): Out Of Range Register
2024-12-24T00:25:24.024340+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 3, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.024341+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x515fb0=0xd 0x515fb4=0x4 0x515fa8=0x7c12b72 0x515fac=0x104
2024-12-24T00:25:24.024341+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 4, SM 0): Out Of Range Register
2024-12-24T00:25:24.024342+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 4, SM 0): Multiple Warp Errors
2024-12-24T00:25:24.024343+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x516730=0xd 0x516734=0x4 0x516728=0x7c12b72 0x51672c=0x104
2024-12-24T00:25:24.025337+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 4, SM 1): Out Of Range Register
2024-12-24T00:25:24.025341+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 4, SM 1): Multiple Warp Errors
2024-12-24T00:25:24.025342+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x5167b0=0x3000d 0x5167b4=0x4 0x5167a8=0x7c12b72 0x5167ac=0x104
2024-12-24T00:25:24.026331+01:00 ordicommun kernel: NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ChID 0018, Class 0000c597, Offset 00000000, Data 00000000
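
A burst like this is easier to read when collapsed. The sketch below counts which (GPC, TPC, SM) units reported an exception and how often each message appears. The pattern is only an assumption based on the lines pasted above, and `summarize_sm_exceptions` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the "Graphics SM ... Exception" lines in this post.
SM_RE = re.compile(
    r"Graphics SM (?P<kind>Warp|Global) Exception on "
    r"\(GPC (?P<gpc>\d+), TPC (?P<tpc>\d+), SM (?P<sm>\d+)\): (?P<reason>.+)"
)


def summarize_sm_exceptions(lines):
    """Return (set of affected (gpc, tpc, sm) units, Counter of (kind, reason))."""
    units = set()
    reasons = Counter()
    for line in lines:
        m = SM_RE.search(line)
        if m is None:
            continue  # skip ESR dumps and unrelated lines
        units.add((int(m["gpc"]), int(m["tpc"]), int(m["sm"])))
        reasons[(m["kind"], m["reason"].strip())] += 1
    return units, reasons
```

If every unit reports the same pair of messages, as appears to be the case here, the summary shrinks the whole dump to the list of affected units plus two message counts.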

I’ve been experiencing a similar issue. What seems interesting is that after setting the following kernel parameters
pci=nomsi clearcpuid=514 nvidia_drm.fbdev=1 loglevel=4 nvidia-drm.modeset=1 nvidia-drm.fbdev=1 nvidia.NVreg_OpenRmEnableUnsupportedGpus=1
playing FFXVI no longer seems to crash the game.
nvidia-bug-report.log.tar.gz (528.1 KB)
I’m running NixOS as well; here’s my config.
edit:
Here you can find the gpu metrics as well, you can see in the spikes when I started playing the game and when I stopped
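
A side note on the kernel parameter line quoted in this post: the kernel treats `-` and `_` as equivalent in parameter names, so `nvidia_drm.fbdev=1` and `nvidia-drm.fbdev=1` set the same option twice. Below is a minimal sketch for spotting such repeats; `find_duplicate_params` is a hypothetical helper, and the simple whitespace split assumes no quoted values containing spaces.

```python
from collections import defaultdict


def find_duplicate_params(cmdline):
    """Group kernel command-line tokens by name, treating '-' and '_' alike.

    Returns only the names that appear more than once.
    Assumes values contain no quoted spaces.
    """
    seen = defaultdict(list)
    for token in cmdline.split():
        name, _, _value = token.partition("=")
        seen[name.replace("-", "_")].append(token)
    return {name: tokens for name, tokens in seen.items() if len(tokens) > 1}
```

Running it on the contents of `/proc/cmdline` shows whether the booted system actually carries the repeated entry. A duplicate like this is harmless, but removing it makes it easier to tell which parameter is doing the work.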

A new update on this: after upgrading to kernel version 6.12.8, the issue still appeared. I now believe the Xid 109 error is caused by the greeter; launching my desktop environment from a TTY instead of using SDDM seems to be a workaround for this issue. Here’s another bug report log.
nvidia-bug-report.log.gz (728.0 KB)
and another screenshot from grafana


The first dip, at around 19:00, is when the error appeared. After that I stopped the display-manager service and launched Hyprland from the TTY, which let me play for at least 2h, at which point I closed the game.
edit:
With the Xid 109, it looks like the game segfaults, exiting with signal 11:
Process 10253 (ffxvi.exe) of user 1000 dumped core.
This is the stack trace:
Stack trace of thread 10658:
#0 0x00007f72e4c0bcda n/a (n/a + 0x0)
#1 0x00007f72e4c0da61 n/a (n/a + 0x0)
#2 0x00007f72e6440a70 n/a (n/a + 0x0)
ELF object binary architecture: AMD x86-64

Hi @shelter @Lifeismana @tellusEidolon
Please help confirm whether you are still experiencing the Xid 109 error with ACValhalla on driver 570.86.16.

@amrits
For me it seems to work, RTX 4070. I have more or less finished the game, so I did not do any longer testing. But I cranked up the resolution scale to 200% and ran around a bit, and also traveled. I also tried with resolution scale 100% for a while. (The reason I cranked up the resolution scale was that an old issue with the driver was that it crashed with Xid 109 when GPU usage went above 98% while playing. It did not crash now, and I was at 99% usage at times.)