This is kinda silly, 7 months without a fix. It reminds me of when we lost audio whenever the frequency was above 60Hz; it took basically a year for that fix to come.
Day 3 now of running the new 525.116.04 drivers without a crash, touch wood. On the previous 530.30.02 I was getting 1-4 crashes a day. I am cautiously optimistic at this stage.
My 530.30.02 was installed automatically by the CUDA 12.1 installer. 525.116.04 appears to come bundled with CUDA 12.0, although the old 12.1 installation is still there and I can still build and run applications targeted at 12.1.
I've not tried any Windows games.
525.116.04 still affected.
Mai 15 22:13:05 kleinerpopel kernel: NVRM: GPU at PCI:0000:26:00: GPU-6f98b267-20cc-5347-51dc-8bad07fd2ad0
Mai 15 22:13:05 kleinerpopel kernel: NVRM: Xid (PCI:0000:26:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 0, SM 1): Illegal Instruction Parameter
Mai 15 22:13:05 kleinerpopel kernel: NVRM: Xid (PCI:0000:26:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x5147b0=0xb 0x5147b4=0x0 0x5147a8=0xf812b60 0x5147ac=0x1104
Mai 15 22:13:09 kleinerpopel kernel: NVRM: Xid (PCI:0000:26:00): 109, pid=7052, name=MetroExodus.exe, Ch 00000036, errorString CTX SWITCH TIMEOUT, Info 0x8c01b
nvidia-bug-report.log.gz (777.6 KB)
525.47.24, not fixed...
Still not working, what a shame
Now on 6 days without a crash since installing 525.116.04. Looks like they've fixed the real problem here of drivers frequently hanging during general use of their latest notebook hardware. Thanks @amrits - although some communication would have been nice.
Those hoping that nvidia are going to expend any effort to make games only designed to run in Windows work on Linux are probably in for a long wait. Why should they bother? What's so awful about dual booting?
Wrong, Nvidia patched a lot of issues with their Linux driver to make games running via Proton/Wine work properly, including but not limited to Doom Eternal, Cyberpunk and others.
Also, why shouldn't I, who paid the same amount of money for a GPU as a Windows user, be allowed to receive the same level of customer support?
They offer Unix/Linux drivers and it is my right as a customer to request support if they are broken.
Also, the driver is not being used outside its specification, as the issue seems to be related to a bug in their Vulkan and/or CUDA driver, which they also officially support on Unix/Linux.
Furthermore, there is a lot that is awful about dual booting, notably wasting at least 100GB on an extra partition for an OS I would not use except for maybe 2 games.
That is not a justification for going through all the dual boot hoops and dealing with Windows randomly messing up the bootloader configuration from time to time.
Not to speak of the anti-customer, anti-privacy and security issues which come along with using Windows.
It is my computer, I paid for it, and I should have the freedom to use any OS on it I like without being told by others to use something else, or to run an OS which constantly reminds me of the fact that it's no longer my PC but Microsoft's.
Can you please clean install driver 525.116.04 and test again.
Fix is incorporated in this driver, the notebook where I had repro is working fine with driver 525.116.04
Can you please test with driver 525.116.04 and share test results.
Thanks for sharing the test results with driver 525.116.04.
Try running the Steam version of Metro Exodus Enhanced Edition with 525.116.04. It crashes before the main menu. It worked correctly up to 515.
Tested with 525.116.04 and Resident Evil 4 still crashes with XID 109.
I'm currently using a desktop RTX 2070 (non-Super) and kernel 6.2 on EndeavourOS.
NVIDIA-SMI 525.116.04 Driver Version: 525.116.04 CUDA Version: 12.0
If I set the Graphics to medium/low or keep the GPU usage below 80% I can game for hours on end without issues, but if I remove the 60FPS cap and remove FSR2 the game crashes like 10 to 20 mins in.
Dmesg:
[ 1243.631777] NVRM: GPU at PCI:0000:01:00: GPU-4dab7fc5-927c-2242-fbe8-b8655f657b70
[ 1243.631780] NVRM: Xid (PCI:0000:01:00): 109, pid=3012, name=re4.exe, Ch 0000007e, errorString CTX SWITCH TIMEOUT, Info 0x1c03d
Journalctl:
mai 17 01:54:43 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=186473, name=re4.exe, Ch 000000c6, errorString CTX SWITCH TIMEOUT, Info 0x68c05e
mai 17 01:57:58 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=187111, name=re4.exe, Ch 00000096, errorString CTX SWITCH TIMEOUT, Info 0x2c058
mai 17 02:01:22 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=187621, name=re4.exe, Ch 00000096, errorString CTX SWITCH TIMEOUT, Info 0x1c058
mai 17 02:10:22 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=189123, name=re4.exe, Ch 000000c6, errorString CTX SWITCH TIMEOUT, Info 0x2c05e
mai 17 02:31:01 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=190906, name=re4.exe, Ch 0000009e, errorString CTX SWITCH TIMEOUT, Info 0x1c05b
mai 17 03:15:45 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=196266, name=re4.exe, Ch 0000009e, errorString CTX SWITCH TIMEOUT, Info 0x1c058
mai 17 03:54:58 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=198235, name=re4.exe, Ch 0000009e, errorString CTX SWITCH TIMEOUT, Info 0x1c058
mai 17 15:56:07 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=7121, name=re4.exe, Ch 0000007e, errorString CTX SWITCH TIMEOUT, Info 0x2c04c
mai 17 16:25:16 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=9359, name=re4.exe, Ch 0000007e, errorString CTX SWITCH TIMEOUT, Info 0x1c04c
mai 17 17:03:32 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=2546, name=re4.exe, Ch 00000096, errorString CTX SWITCH TIMEOUT, Info 0x1c052
mai 17 22:00:48 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=21484, name=re4.exe, Ch 000000c6, errorString CTX SWITCH TIMEOUT, Info 0x12c06a
mai 18 17:13:32 HellsGate kernel: NVRM: Xid (PCI:0000:01:00): 109, pid=3012, name=re4.exe, Ch 0000007e, errorString CTX SWITCH TIMEOUT, Info 0x1c03d
All crashes present as Xid 109.
It's kinda turning me off from using Nvidia in the future ngl.
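For anyone who wants to summarize their own logs the same way, here is a minimal Python sketch that tallies NVRM Xid events by code and process name. It only assumes the line format seen in the dmesg/journalctl excerpts in this thread; the regular expression and the function names (`parse_xid_events`, `tally`) are my own illustration and not part of any NVIDIA tool.

```python
import re
from collections import Counter

# Assumed line format, taken from the kernel log excerpts in this thread:
#   NVRM: Xid (PCI:0000:01:00): 109, pid=3012, name=re4.exe, Ch 0000007e, ...
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<bus>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>[^,]+), name=(?P<name>[^,]+)"
)


def parse_xid_events(lines):
    """Yield (xid, process name) for every line that contains an NVRM Xid report."""
    for line in lines:
        match = XID_RE.search(line)
        if match:
            yield int(match.group("xid")), match.group("name").strip()


def tally(lines):
    """Count Xid events per (xid, process name) pair."""
    return Counter(parse_xid_events(lines))


# Two sample lines copied from the logs posted above.
sample = [
    "[ 1243.631777] NVRM: GPU at PCI:0000:01:00: GPU-4dab7fc5-927c-2242-fbe8-b8655f657b70",
    "[ 1243.631780] NVRM: Xid (PCI:0000:01:00): 109, pid=3012, name=re4.exe, "
    "Ch 0000007e, errorString CTX SWITCH TIMEOUT, Info 0x1c03d",
]

for (xid, name), count in tally(sample).most_common():
    print(f"Xid {xid}: {name} x{count}")
```

To run it against a real log, pass the lines of your saved dmesg or kernel journal output to `tally` instead of `sample`.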
I am also affected by this issue. It's very annoying as it keeps me from doing work. I tried different driver versions and none of them works properly.
[ 321.848872] NVRM: Xid (PCI:0000:01:00): 109, pid=11447, name=###, Ch 00000029, errorString CTX SWITCH TIMEOUT, Info 0x8c019
I removed the nvidia driver entirely from my system, rebooted and reinstalled it and rebooted again.
Nothing changed compared to my previous report.
I ran Metro Exodus in Safe Mode to advance to the menu, applied all the settings as described here:
I restarted the game and the Xid 109 returned before I could see the game menu.
Mai 23 11:05:10 kleinerpopel kernel: NVRM: GPU at PCI:0000:26:00: GPU-6f98b267-20cc-5347-51dc-8bad07fd2ad0
Mai 23 11:05:10 kleinerpopel kernel: NVRM: Xid (PCI:0000:26:00): 13, pid='<unknown>', name=<unknown>, Graphics SM Warp Exception on (GPC 2, TPC 0, SM 1): Illegal Instruction Parameter
Mai 23 11:05:10 kleinerpopel kernel: NVRM: Xid (PCI:0000:26:00): 13, pid='<unknown>', name=<unknown>, Graphics Exception: ESR 0x5147b0=0x11000b 0x5147b4=0x0 0x5147a8=0xf812b60 0x5147ac=0x1104
Mai 23 11:05:15 kleinerpopel kernel: NVRM: Xid (PCI:0000:26:00): 109, pid=28930, name=MetroExodus.exe, Ch 0000002e, errorString CTX SWITCH TIMEOUT, Info 0x8c018
nvidia-bug-report.log.gz (801.0 KB)
Finally, beta 535.43.02 and Metro works again. Thank you
For me the problem still persists with 535.43.02 :/
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.43.02              Driver Version: 535.43.02    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2080        Off | 00000000:01:00.0  On |                  N/A |
| 25%   56C    P0              56W / 215W |   1040MiB /  8192MiB |     10%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
[ 215.653000] NVRM: GPU at PCI:0000:01:00: GPU-cc1d3779-2cfa-1a03-c8c6-fd3541676d49
[ 215.653004] NVRM: Xid (PCI:0000:01:00): 109, pid=8611, name=XXX-Linux-Shi, Ch 00000031, errorString CTX SWITCH TIMEOUT, Info 0x8c017
[ 311.290140] NVRM: Xid (PCI:0000:01:00): 109, pid=8871, name=XXX-Linux-Shi, Ch 00000031, errorString CTX SWITCH TIMEOUT, Info 0x8c017
[ 335.297576] NVRM: Xid (PCI:0000:01:00): 109, pid=9025, name=XXX-Linux-Shi, Ch 00000031, errorString CTX SWITCH TIMEOUT, Info 0x8c017
I can provoke what seems to be that error when running Witcher 3 with RT enabled through Lutris. Funnily enough, with the same Proton version in both Lutris and Steam, if I use Steam to run Witcher 3 with RT it works fine... ?? I don't understand this one, but I haven't dug into why.
The issue still appears to be happening with 525.116.04. I've been playing Wasteland 3 for a little while now, and on the older driver it had no crashes or issues. On the newer driver, however, there's constant crashing. The Steeltown final battle? I have not been able to finish it due to its length and the crashing.
[ 2320.298894] NVRM: GPU at PCI:0000:01:00: GPU-d94a8dd3-7c3d-ecc0-033e-1ec177ee9969
[ 2320.298897] NVRM: Xid (PCI:0000:01:00): 109, pid=3428, name=WL3.exe, Ch 00000016, errorString CTX SWITCH TIMEOUT, Info 0x1c010
As can be seen above, XID 109 is still alive and well.
Truly hope this gets fixed sometime soon.
Edit: Forgot to post nvidia-smi output
==============NVSMI LOG==============
Timestamp : Wed May 31 11:45:57 2023
Driver Version : 525.116.04
CUDA Version : 12.0
Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : NVIDIA GeForce GTX 1650
Product Brand : GeForce
Product Architecture : Turing
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-d94a8dd3-7c3d-ecc0-033e-1ec177ee9969
Minor Number : 0
VBIOS Version : 90.17.4B.00.33
MultiGPU Board : No
Board ID : 0x100
Board Part Number : N/A
GPU Part Number : 1F99-753-A1
Module ID : 1
Inforom Version
Image Version : G001.0000.02.04
OEM Object : 1.1
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GSP Firmware Version : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x1F9910DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x170D1043
GPU Link Info
PCIe Generation
Max : 3
Current : 1
Device Current : 1
Device Max : 3
Host Max : 3
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : N/A
Performance State : P8
Clocks Throttle Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
FB Memory Usage
Total : 4096 MiB
Reserved : 192 MiB
Used : 6 MiB
Free : 3896 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 4 MiB
Free : 252 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows : N/A
Temperature
GPU Current Temp : 52 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 99 C
GPU Slowdown Temp : 94 C
GPU Max Operating Temp : 87 C
GPU Target Temperature : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : Supported
Power Draw : 6.13 W
Power Limit : 50.00 W
Default Power Limit : 50.00 W
Enforced Power Limit : 50.00 W
Min Power Limit : 1.00 W
Max Power Limit : 50.00 W
Clocks
Graphics : 300 MHz
SM : 300 MHz
Memory : 405 MHz
Video : 540 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 1785 MHz
SM : 1785 MHz
Memory : 6001 MHz
Video : 1650 MHz
Max Customer Boost Clocks
Graphics : 1785 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : N/A
Fabric
State : N/A
Status : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 1687
Type : G
Name : /usr/bin/Xorg.bin
Used GPU Memory : 4 MiB
you need 535.43.02 not 525
I'm confused...
- Nvidia advised that the fix has been implemented in 525.116.04. Despite this, the issue persists.
- Simon.suckut literally just posted above your last post (here) advising that the issue still persists with 535.43.02.
I imagine at this point, some additional follow-up from NVIDIA is needed.