515.43.04 - Xid 13 in Metro Exodus Enhanced Edition

Fedora 36, kernel 5.17.11, driver from rpmfusion (the F37 package, but that shouldn’t make any difference: the binary portion is the same and the kmod is rebuilt locally with akmod).

510.68.02 works fine. 510.73.05 hasn’t been packaged by rpmfusion yet, hence not tested.

515.43.04, however, has an issue with only one game: Metro Exodus Enhanced Edition (the RTX-only re-release of Metro Exodus) on Steam. I tried different Proton and Proton-GE versions as well. Some of my other games work and CUDA works, but this single game hangs after causing an Xid 13, and only with this driver version.

[  154.343014] NVRM: GPU at PCI:0000:01:00: GPU-178fa3da-a238-320f-5615-f4d2ad51aaca
[  154.343018] NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 5, SM 0): Illegal Instruction Parameter
[  154.343027] NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Global Exception on (GPC 2, TPC 5, SM 0): Multiple Warp Errors
[  154.343035] NVRM: Xid (PCI:0000:01:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x516f30=0xa000b 0x516f34=0x4 0x516f28=0xe812b60 0x516f2c=0x1174
[  154.343485] NVRM: Xid (PCI:0000:01:00): 13, pid=4363, name=MetroExodus.exe, Graphics Exception: ChID 002e, Class 0000c797, Offset 00000000, Data 00000000
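
In case it helps anyone comparing driver versions, here is a minimal Python sketch for pulling Xid events out of dmesg text. The regular expression is only inferred from the lines above, so treat the format as an assumption rather than a documented one; `parse_xid_lines` and its field names are made up for illustration.

```python
import re

# Pattern inferred from the log lines in this post (an assumption, not an
# official format). Example line:
# [  154.343485] NVRM: Xid (PCI:0000:01:00): 13, pid=4363, name=MetroExodus.exe, Graphics Exception: ...
XID_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): "
    r"(?P<xid>\d+), pid=(?P<pid>[^,]+), name=(?P<name>[^,]+), (?P<detail>.*)"
)


def parse_xid_lines(text):
    """Return one dict per NVRM Xid line found in `text`."""
    events = []
    for line in text.splitlines():
        match = XID_RE.search(line)
        if match is None:
            continue  # not an Xid line (e.g. the "NVRM: GPU at PCI..." line)
        pid = match.group("pid").strip("'")
        events.append({
            "timestamp": float(match.group("ts")),
            "pci": match.group("pci"),
            "xid": int(match.group("xid")),
            # pid is reported as '<unknown>' on some lines; map that to None
            "pid": int(pid) if pid.isdigit() else None,
            "name": match.group("name"),
            "detail": match.group("detail"),
        })
    return events
```

Feeding it the output of `dmesg` (which may need root on some systems) gives one entry per Xid line; filtering on `name == "MetroExodus.exe"` then isolates the line that names the offending process.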

This happens every time, right after the game loads and lets me skip the intro and go into the menu. The screen goes black, the whole desktop hangs for a short moment, then the game consumes 100% of a single core and has to be killed. Meanwhile, an Xid error shows up in dmesg.

After such a crash and kill, the game asks whether to enable “safe mode”. In safe mode the game starts and runs, but after going back to 1080p and reapplying the settings, it will eventually cause the same Xid error in a cutscene. If I quit instead, every start outside safe mode will cause it just like before.

nvidia-bug-report.log.gz (364.3 KB)

I’m happy to report that it doesn’t happen in 515.48.07 anymore.