565 release feedback & discussion

Changing the backlight value is still broken/does nothing with 565.57.01 on a Clevo P751DM2-G.

But what do I expect, it has only been broken for 3 years now, starting with 465.24.02.

My external monitor keeps freezing whenever I run a D3D12 game (in this case FH4).
I'm using optimus-manager in hybrid mode. The game is running on the NVIDIA GPU.
I either have to restart my computer, or go to a TTY then go back to my desktop to have it working again.
RTX4070 + AMD Ryzen 7 8845HS on EndeavourOS.
Note that the integrated display of my laptop is disabled, I only use the external one.

nvidia-bug-report.log.gz (1.8 MB)

edit: thx @jtellm, adding VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait lets me run the game. External display will still randomly freeze, or whenever I alt-tab out of the game.
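
For anyone else trying the same workaround, here is a rough sketch of how the variable can be set. It assumes the game is started through Steam/Proton (launch options) or from a shell; nothing here is specific to the driver.

```shell
# Steam launch options (game Properties > Launch Options); %command% is
# Steam's placeholder for the game's own command line:
#   VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait %command%

# Outside Steam, export the variable before starting the game from the same shell:
export VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait
printenv VKD3D_DISABLE_EXTENSIONS
```

The last line only confirms that the variable is present in the environment the game will inherit.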

Yeah, they release a "BETA" driver and ignore almost all feedback. But that's nothing new, unfortunately.
We are paying customers and get worse feedback from a trillion-dollar corporation than from a random hobbyist project on GitHub. Sad.

I've been able to use the driver for a bit, and it worked perfectly in the beginning. But now that I've updated to the latest Linux kernel, I can only boot with the LTS kernel, and all Wayland window managers are either completely broken (Hyprland doesn't work) or mostly unusable (KWin does some weird stuff: my mouse cursor is duplicated, the center of the mouse cursor is way below the actual mouse cursor, and I get low FPS on the desktop with nothing else open).

can confirm, it is fixed when I use electron-nightly

Even with kernel 6.12 I am still seeing suspend fail when the idle timeout expires. It seems gnome-shell refuses to be frozen:

nov 30 18:22:04 fedoracosta systemd[1]: nvidia-suspend.service: Deactivated successfully.
nov 30 18:22:04 fedoracosta systemd[1]: Finished nvidia-suspend.service - NVIDIA system suspend actions.
nov 30 18:22:04 fedoracosta audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=nvidia-suspend comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=succ>
nov 30 18:22:04 fedoracosta audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=nvidia-suspend comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=succe>
nov 30 18:22:04 fedoracosta systemd[1]: Starting systemd-suspend.service - System Suspend...
nov 30 18:22:04 fedoracosta systemd-sleep[27980]: User sessions remain unfrozen on explicit request ($SYSTEMD_SLEEP_FREEZE_USER_SESSIONS=0).
nov 30 18:22:04 fedoracosta systemd-sleep[27980]: This is not recommended, and might result in unexpected behavior, particularly
nov 30 18:22:04 fedoracosta systemd-sleep[27980]: in suspend-then-hibernate operations or setups with encrypted home directories.
nov 30 18:22:04 fedoracosta systemd-sleep[27980]: Performing sleep operation 'suspend'...
nov 30 18:22:04 fedoracosta kernel: PM: suspend entry (deep)
nov 30 18:22:04 fedoracosta kernel: Filesystems sync: 0.113 seconds
nov 30 18:22:24 fedoracosta kernel: Freezing user space processes
nov 30 18:22:24 fedoracosta kernel: Freezing user space processes failed after 20.001 seconds (1 tasks refusing to freeze, wq_busy=0):
nov 30 18:22:24 fedoracosta kernel: task:gnome-shell     state:R  running task     stack:0     pid:2433  tgid:2433  ppid:2292   flags:0x0000400e
nov 30 18:22:24 fedoracosta kernel: Call Trace:
nov 30 18:22:24 fedoracosta kernel:  <TASK>
nov 30 18:22:24 fedoracosta kernel:  ? _nv027804rm+0x70/0x70 [nvidia]
nov 30 18:22:24 fedoracosta kernel:  ? srso_return_thunk+0x5/0x5f
nov 30 18:22:24 fedoracosta kernel:  ? os_get_current_tick+0x3b/0xa0 [nvidia]

nvidia-bug-report.log.gz (469.6 KB)

Oddly enough, I don't have any problems if I manually suspend the system.

Random GPU hang (can't even reboot properly) when trying to suspend (proprietary modules, GSP disabled, RTX 3060 laptop):

[95042.977205] BUG: kernel NULL pointer dereference, address: 0000000000000085
[95042.977210] fbcon: Taking over console
[95042.977212] #PF: supervisor read access in kernel mode
[95042.977214] #PF: error_code(0x0000) - not-present page
[95042.977215] PGD 0 P4D 0 
[95042.977217] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[95042.977220] CPU: 14 UID: 0 PID: 37600 Comm: kworker/14:0 Tainted: P           O       6.11.10-300.fc41.x86_64 #1
[95042.977223] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
[95042.977224] Hardware name: Dell Inc. Dell G15 5510/0H0F1D, BIOS 1.28.0 10/01/2024
[95042.977226] Workqueue: kacpi_notify acpi_os_execute_deferred
[95042.977230] RIP: 0010:_nv014851rm+0x2a4/0xab0 [nvidia]
[95042.977540] Code: ff 00 00 00 0f 84 ac 00 00 00 44 3b 7d 10 74 16 41 83 c7 01 41 83 ff ff 0f 85 a8 fe ff ff 0f 1f 84 00 00 00 00 00 48 8b 45 18 <80> b8 85 00 00 00 00 0f 85 9f 06 00 00 8b 45 20 8b 0d ee d5 df 00
[95042.977543] RSP: 0018:ffffb3a00c60bcb8 EFLAGS: 00010046
[95042.977545] RAX: 0000000000000000 RBX: ffffffffc2b6c880 RCX: 000f561f101d7240
[95042.977546] RDX: 000000000000000d RSI: 0000000000000000 RDI: 0000000003dfd240
[95042.977548] RBP: ffff9ce232a65f50 R08: 0000000000000001 R09: 0000000000000020
[95042.977549] R10: ffff9ce232a65f1c R11: ffff9ce57f7217c0 R12: 0000000000000001
[95042.977551] R13: 0000000000000000 R14: ffff9ce204a70008 R15: 0000000000000000
[95042.977552] FS:  0000000000000000(0000) GS:ffff9ce57f700000(0000) knlGS:0000000000000000
[95042.977554] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[95042.977556] CR2: 0000000000000085 CR3: 000000046442a005 CR4: 00000000007706f0
[95042.977558] PKRU: 55555554
[95042.977559] Call Trace:
[95042.977561]  <TASK>
[95042.977563]  ? __die_body.cold+0x19/0x27
[95042.977566]  ? page_fault_oops+0x15a/0x2f0
[95042.977570]  ? exc_page_fault+0x7e/0x180
[95042.977572]  ? asm_exc_page_fault+0x26/0x30
[95042.977575]  ? _nv014851rm+0x2a4/0xab0 [nvidia]
[95042.977779]  ? _nv014851rm+0x1f6/0xab0 [nvidia]
[95042.977982]  _nv049277rm+0x110/0x170 [nvidia]
[95042.978222]  _nv015333rm+0x1f8/0x390 [nvidia]
[95042.978405]  _nv054253rm+0x89/0x1c0 [nvidia]
[95042.978587]  rm_acpi_notify+0x1e3/0x280 [nvidia]
[95042.978829]  acpi_ev_notify_dispatch+0x48/0x80
[95042.978831]  acpi_os_execute_deferred+0x17/0x30
[95042.978834]  process_one_work+0x176/0x330
[95042.978837]  worker_thread+0x252/0x390
[95042.978840]  ? __pfx_worker_thread+0x10/0x10
[95042.978842]  kthread+0xcf/0x100
[95042.978844]  ? __pfx_kthread+0x10/0x10
[95042.978846]  ret_from_fork+0x31/0x50
[95042.978848]  ? __pfx_kthread+0x10/0x10
[95042.978850]  ret_from_fork_asm+0x1a/0x30
[95042.978853]  </TASK>
[95042.978854] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer tun nf_nat_tftp nf_nat_ftp nft_masq nf_conntrack_sane nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_tftp nf_conntrack_ftp nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject vboxnetadp(O) vboxnetflt(O) nft_ct vboxdrv(O) nft_chain_nat nvidia_drm(PO) nvidia_modeset(PO) ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security nvidia_uvm(PO) nf_tables ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables qrtr nvidia(PO) uhid bnep snd_ctl_led snd_soc_skl_hda_dsp snd_soc_hdac_hdmi snd_soc_intel_hda_dsp_common snd_sof_probes binfmt_misc vfat fat snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_soc_dmic snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel soundwire_cadence snd_sof_intel_hda_common
[95042.978883]  snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils soundwire_generic_allocation soundwire_bus snd_soc_avs iwlmvm snd_soc_hda_codec snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_sst_ipc snd_soc_sst_dsp intel_uncore_frequency snd_soc_acpi_intel_match intel_uncore_frequency_common snd_soc_acpi intel_tcc_cooling mac80211 snd_soc_core snd_hda_codec_hdmi snd_compress x86_pkg_temp_thermal intel_powerclamp ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg coretemp dell_pc snd_intel_sdw_acpi platform_profile iTCO_wdt snd_hda_codec kvm_intel spi_nor libarc4 snd_hda_core intel_pmc_bxt intel_rapl_msr dell_laptop mei_hdcp mei_pxp ee1004 mtd iTCO_vendor_support snd_hwdep kvm uvcvideo btusb uvc videobuf2_vmalloc btrtl dell_wmi videobuf2_memops processor_thermal_device_pci_legacy processor_thermal_device snd_seq btintel videobuf2_v4l2 processor_thermal_wt_hint btbcm snd_seq_device videobuf2_common processor_thermal_rfim btmtk rapl processor_thermal_rapl iwlwifi
[95042.978919]  videodev snd_pcm r8169 intel_rapl_common bluetooth dell_smbios mc intel_cstate dell_wmi_sysman snd_timer dcdbas cfg80211 intel_uncore alienware_wmi pcspkr dell_smm_hwmon firmware_attributes_class dell_wmi_descriptor thunderbolt mei_me processor_thermal_wt_req snd i2c_i801 spi_intel_pci wmi_bmof processor_thermal_power_floor intel_wmi_thunderbolt processor_thermal_mbox mei realtek spi_intel soundcore i2c_smbus drm_ttm_helper rfkill idma64 intel_soc_dts_iosf intel_pch_thermal int3403_thermal int340x_thermal_zone intel_pmc_core joydev intel_vsec pmt_telemetry intel_hid int3400_thermal pmt_class acpi_pad acpi_tad acpi_thermal_rel sparse_keymap loop dm_multipath nfnetlink i915 nvme i2c_algo_bit crct10dif_pclmul drm_buddy crc32_pclmul nvme_core crc32c_intel ttm polyval_clmulni polyval_generic drm_display_helper ghash_clmulni_intel hid_multitouch sha512_ssse3 sha256_ssse3 ucsi_acpi sha1_ssse3 typec_ucsi nvme_auth cec i2c_hid_acpi typec i2c_hid video wmi pinctrl_cannonlake serio_raw scsi_dh_rdac scsi_dh_emc
[95042.978959]  scsi_dh_alua fuse i2c_dev
[95042.978969] CR2: 0000000000000085
[95042.978971] ---[ end trace 0000000000000000 ]---
[95042.978972] RIP: 0010:_nv014851rm+0x2a4/0xab0 [nvidia]
[95042.979199] Code: ff 00 00 00 0f 84 ac 00 00 00 44 3b 7d 10 74 16 41 83 c7 01 41 83 ff ff 0f 85 a8 fe ff ff 0f 1f 84 00 00 00 00 00 48 8b 45 18 <80> b8 85 00 00 00 00 0f 85 9f 06 00 00 8b 45 20 8b 0d ee d5 df 00
[95042.979202] RSP: 0018:ffffb3a00c60bcb8 EFLAGS: 00010046
[95042.979204] RAX: 0000000000000000 RBX: ffffffffc2b6c880 RCX: 000f561f101d7240
[95042.979205] RDX: 000000000000000d RSI: 0000000000000000 RDI: 0000000003dfd240
[95042.979207] RBP: ffff9ce232a65f50 R08: 0000000000000001 R09: 0000000000000020
[95042.979208] R10: ffff9ce232a65f1c R11: ffff9ce57f7217c0 R12: 0000000000000001
[95042.979210] R13: 0000000000000000 R14: ffff9ce204a70008 R15: 0000000000000000
[95042.979211] FS:  0000000000000000(0000) GS:ffff9ce57f700000(0000) knlGS:0000000000000000
[95042.979213] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[95042.979215] CR2: 0000000000000085 CR3: 000000046442a005 CR4: 00000000007706f0
[95042.979216] PKRU: 55555554
[95042.979217] note: kworker/14:0[37600] exited with irqs disabled
[95042.979224] note: kworker/14:0[37600] exited with preempt_count 1
[95042.993453] Console: switching to colour frame buffer device 240x67
[104523.827783] perf: interrupt took too long (2509 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[134328.792753] EDID block 0 is all zeroes
[134329.276793] EDID block 0 is all zeroes

osu-lazer (OpenGL) freezes when direct scanout is triggered. Here is the bug report from the repro state:
nvidia-bug-report.log.gz (1.4 MB)

All I did was launch osu-lazer, tab out, and go back in, and my entire main monitor was frozen.

I have the problem consistently with Nioh 2 and only Nioh 2 (it freezes after about a quarter of an hour or when a special shader is used, I don't know which). A friend I was playing co-op with could play without freezes, but all of a sudden it started happening to him too. Both of us just use Steam + Nioh 2, and we didn't use any special settings that I'm aware of. I can screenshot and post my in-game settings if you want.

I'm having really bad performance on Linux.

I was testing The Finals, but even at the lowest settings the game is unplayable: I get stutter and frame drops.

  • Tested with Vsync: On, Frame Rate Limit:140 and with Vsync Off, Frame Rate Limit: Off
  • DLSS: Ultra Performance
  • Reflex: Off
  • Motion Blur and Lens Distortion: Disabled
  • NVIDIA RTX Global Illumination: Static
  • All graphics settings at minimum.

Recording:

Spec:
CPU: 5800X3D
GPU: RTX 3080 10Gb
Driver: 565.57.01 (closed kernel module with NVreg_EnableGpuFirmware=0)
RAM: 32Gb
RES: 3440x1440
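
For reference, a minimal sketch of how the NVreg_EnableGpuFirmware=0 setting listed above is typically applied and checked. The file name nvidia-gsp.conf and the helper gsp_setting are illustrative assumptions, not part of the driver; a change to the option may also need an initramfs rebuild and a reboot before it shows up.

```shell
# /etc/modprobe.d/nvidia-gsp.conf (hypothetical file name) would contain:
#   options nvidia NVreg_EnableGpuFirmware=0

# Hypothetical helper: print the EnableGpuFirmware value from text in the
# "Name: value" format used by /proc/driver/nvidia/params (read on stdin).
gsp_setting() {
    awk -F': *' '$1 == "EnableGpuFirmware" { print $2 }'
}

# On a machine with the driver loaded:
#   gsp_setting < /proc/driver/nvidia/params
```

A printed value of 0 would match the configuration described in the spec list.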

With the 560 driver version, I definitely had far fewer of these problems.

Tested on vanilla Arch and CachyOS with Hyprland, GNOME and KDE, X11/Xwayland, DX11/DX12, with more or less the same behavior (also in other games).

@fdelente

This may be unrelated to 565 drivers, but I've been doing some more testing for Nioh 2, and I could reproduce the 565 freeze, even on 550, but only when using gamescope (version 3.15.14). Maybe it's a coincidence, but it's the exact same behavior as playing with the newer drivers.

With these params (base gamescope without any upscaling), NO FREEZE, I played for about 45 minutes and it was very stable:

gamescope -- %command%

With these params (lower resolution with basic upscaler + sharpness), I freeze 1 to 5 minutes after starting a mission:

gamescope -h 720 -F fsr --sharpness 0 -- %command%

(for some reason it seems that --sharpness 0 is the highest sharpness and 20 is the lowest)

If I manually turn on FSR or NIS during a mission and mess around with the sharpness, it eventually triggers the freeze:

gamescope -- %command%
...
[during a mission]
SUPER + U [Enable FSR]
SUPER + Y [Enable NIS]
SUPER + I, SUPER + O [-/+ sharpness]

Maybe this specific game really doesn't like being injected with any kind of post-processing. I don't know if the newer drivers do something like that by default. I did have some custom profiles in nvidia-settings with some sharpness, but they are completely disabled (and they are also still there on 550, which didn't cause an issue: I played for 6 hours yesterday without a single freeze).

The freezes don't seem to happen if I just stay in the menus (or maybe they take longer to happen there and I just haven't hit one).

Just throwing everything at the wall seeing what sticks right now

$ nvidia-smi 
Mon Dec  2 21:37:30 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA T1000 8GB               On  |   00000000:01:00.0  On |                  N/A |
| 45%   62C    P5             N/A /   50W |    1255MiB /   8192MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A       728      G   /usr/lib/Xorg                                 492MiB |
|    0   N/A  N/A       920      G   xfwm4                                           2MiB |
|    0   N/A  N/A      1072      G   /usr/lib/thunderbird/thunderbird              158MiB |
|    0   N/A  N/A      5950      G   /usr/lib/firefox/firefox                      566MiB |
+-----------------------------------------------------------------------------------------+

$ uname -a
Linux hostname 6.12.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 22 Nov 2024 16:04:27 +0000 x86_64 GNU/Linux

$ journalctl -b-1 | grep NVRM
Dec 02 21:30:53 hostname kernel: NVRM: GPU at PCI:0000:01:00: GPU-aa1f5ee0-9eb3-c7d8-d810-8a7063429e2a
Dec 02 21:30:53 hostname kernel: NVRM: GPU Board Serial Number: 1420222065051
Dec 02 21:30:53 hostname kernel: NVRM: Xid (PCI:0000:01:00): 62, pid='<unknown>', name=<unknown>, 00011ed7 00011f33 00011a64 0001598b 00015d96 00013d55 00000011 00000000
Dec 02 21:30:53 hostname kernel: NVRM: Xid (PCI:0000:01:00): 154, pid='<unknown>', name=<unknown>, GPU recovery action changed from 0x0 (None) to 0x1 (GPU Reset Required)
Dec 02 21:31:03 hostname kernel: NVRM: Xid (PCI:0000:01:00): 119, pid=923, name=xfwm4, Timeout after 6s of waiting for RPC response from GPU0 GSP! Expected function 10 (FREE) (0xcaf00003 0x0).
Dec 02 21:31:03 hostname kernel: NVRM: GPU0 GSP RPC buffer contains function 10 (FREE) and data 0x00000000caf00003 0x0000000000000000.
Dec 02 21:31:03 hostname kernel: NVRM: GPU0 RPC history (CPU -> GSP):
Dec 02 21:31:03 hostname kernel: NVRM:     entry function                   data0              data1              ts_start           ts_end             duration actively_polling
Dec 02 21:31:03 hostname kernel: NVRM:      0    10   FREE                  0x00000000caf00003 0x0000000000000000 0x00062856f83ee77e 0x0000000000000000          y
Dec 02 21:31:03 hostname kernel: NVRM:     -1    76   GSP_RM_CONTROL        0x00000000c3700104 0x0000000000000014 0x00062856f7878595 0x00062856f78786e6    337us
Dec 02 21:31:03 hostname kernel: NVRM:     -2    76   GSP_RM_CONTROL        0x00000000c3700104 0x0000000000000014 0x00062856f78782d6 0x00062856f7878562    652us
Dec 02 21:31:03 hostname kernel: NVRM:     -3    76   GSP_RM_CONTROL        0x00000000c3700104 0x0000000000000014 0x00062856f7877e8f 0x00062856f7878294   1029us
Dec 02 21:31:03 hostname kernel: NVRM:     -4    76   GSP_RM_CONTROL        0x00000000c3700104 0x0000000000000014 0x00062856f7870850 0x00062856f7870a97    583us
Dec 02 21:31:03 hostname kernel: NVRM:     -5    76   GSP_RM_CONTROL        0x00000000c3700104 0x0000000000000014 0x00062856f78704d3 0x00062856f78706a3    464us
Dec 02 21:31:03 hostname kernel: NVRM:     -6    76   GSP_RM_CONTROL        0x00000000c3700104 0x0000000000000014 0x00062856f786fc0a 0x00062856f786fd1c    274us
Dec 02 21:31:03 hostname kernel: NVRM:     -7    76   GSP_RM_CONTROL        0x00000000c3700104 0x0000000000000014 0x00062856f786f95f 0x00062856f786fbaa    587us
Dec 02 21:31:03 hostname kernel: NVRM: GPU0 RPC event history (CPU <- GSP):
Dec 02 21:31:03 hostname kernel: NVRM:     entry function                   data0              data1              ts_start           ts_end             duration during_incomplete_rpc
Dec 02 21:31:03 hostname kernel: NVRM:      0    4130 RECOVERY_ACTION       0x0000000000000000 0x0000000000000000 0x00062856f801c40a 0x00062856f801c40e      4us
Dec 02 21:31:03 hostname kernel: NVRM:     -1    4102 OS_ERROR_LOG          0x0000000000000000 0x0000000000000000 0x00062856f801c3fc 0x00062856f801c409     13us
Dec 02 21:31:03 hostname kernel: NVRM:     -2    4128 GSP_POST_NOCAT_RECORD 0x0000000000000003 0x0000000000011ed7 0x00062856f801c3f9 0x00062856f801c3fb      2us
Dec 02 21:31:03 hostname kernel: NVRM:     -3    4128 GSP_POST_NOCAT_RECORD 0x0000000000000005 0x00000285056e35f0 0x000628533c3034c5 0x000628533c3034c7      2us
Dec 02 21:31:03 hostname kernel: NVRM:     -4    4128 GSP_POST_NOCAT_RECORD 0x0000000000000002 0x0000000000000025 0x000628533c2c7556 0x000628533c2c7559      3us
Dec 02 21:31:03 hostname kernel: NVRM:     -5    4099 POST_EVENT            0x0000000000000001 0x0000000000000000 0x000628533c2c7547 0x000628533c2c7556     15us
Dec 02 21:31:03 hostname kernel: NVRM:     -6    4128 GSP_POST_NOCAT_RECORD 0x0000000000000005 0x00000285056e35f0 0x000628533c26ca7a 0x000628533c26ca7b      1us
Dec 02 21:31:03 hostname kernel: NVRM:     -7    4128 GSP_POST_NOCAT_RECORD 0x0000000000000005 0x000003750561b994 0x000628533c26c365 0x000628533c26c365
Dec 02 21:31:09 hostname kernel: NVRM: Xid (PCI:0000:01:00): 119, pid=923, name=xfwm4, Timeout after 6s of waiting for RPC response from GPU0 GSP! Expected function 76 (GSP_RM_CONTROL) (0x20800a6a 0x0).
Dec 02 21:31:15 hostname kernel: NVRM: Xid (PCI:0000:01:00): 119, pid=923, name=xfwm4, Timeout after 6s of waiting for RPC response from GPU0 GSP! Expected function 10 (FREE) (0xbeef0403 0x0).
Dec 02 21:31:21 hostname kernel: NVRM: Rate limiting GSP RPC error prints for GPU at PCI:0000:01:00 (printing 1 of every 30).  The GPU likely needs to be reset.

$ lsusb 
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 0b05:1a5c ASUSTek Computer, Inc. USB Audio
Bus 001 Device 003: ID 0b05:19af ASUSTek Computer, Inc. AURA LED Controller
Bus 001 Device 004: ID 1a40:0101 Terminus Technology Inc. Hub
Bus 001 Device 005: ID 0489:e0e2 Foxconn / Hon Hai Wireless_Device
Bus 001 Device 006: ID 0557:8021 ATEN International Co., Ltd Hub
Bus 001 Device 007: ID 17f6:0822 Unicomp, Inc. Ruffian 6 Keyboard v3 [Model M]
Bus 001 Device 008: ID 0557:2276 ATEN International Co., Ltd KVM V1.2.116

$ cat /etc/X11/xorg.conf
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 530.41.03


Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    InputDevice    "Keyboard0" "CoreKeyboard"
EndSection

Section "Files"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "NVIDIA T1000 8GB"
    #Option         "ModeValidation" "AllowNonEdidModes"
    #Option         "TripleBuffer" "1"
    #Option         "Coolbits" "4"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 25
model		: 97
model name	: AMD Ryzen 5 7600 6-Core Processor
stepping	: 2
microcode	: 0xa601206
cpu MHz		: 545.000
cache size	: 1024 KB
physical id	: 0
siblings	: 12
core id		: 0
cpu cores	: 6
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 16
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca fsrm flush_l1d amd_lbr_pmc_freeze
bugs		: sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips	: 7588.35
TLB size	: 3584 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

$ cat /proc/meminfo 
MemTotal:       65553244 kB
MemFree:        60368044 kB
MemAvailable:   62450072 kB
Buffers:          175708 kB
Cached:          2464848 kB
SwapCached:            0 kB

Randomly freezes.

Hi, I would like to provide new information about bug ID [4825128] (freezing in games), and also complement my old report:

In the last few weeks, I had the opportunity to test the nvidia 565 further and better understand whether the freezing would still persist or not.

In general, in my Arch Linux / XFCE4 / X11 environment, I use the nvidia 555.58.02-16 drivers to play without any problems. I play on an external monitor, using optimus-manager in hybrid mode. All Steam/Proton games run only on my dedicated NVIDIA GPU, and the browser, Discord, and other software run mostly on my integrated Intel GPU. Also, all the NVIDIA drivers that I have installed and tested are from the Arch Linux AUR (nvidia aur).

At the beginning of November 2024, I tested the nvidia 565.57.01-2 version and as I previously reported, it made the games freeze. And a quick way for me to check this behavior was through Euro Truck Simulator 2, where it would freeze moments before reaching the start menu.

Last week I was testing with nvidia 565.57.01-8 and, to my surprise, the initial freeze that always happened in Euro Truck Simulator 2 no longer occurred. I tested this version more extensively by playing Metro 2033 Redux and Metro Last Light Redux, and after hours the games still froze (3 times in my tests), but only if I had something in the background using my GPUs, such as streaming, video, or a Discord call. I want to highlight that I played for long hours without any problems, and the three times it froze were when I was either in an audio call on Discord or listening to/watching a stream.

Furthermore, I didn't have the chance to test games in DX12 (vkd3d); I only tested in DX11 (dxvk), all with Proton-Experimental [bleeding-edge].

Anyway, I don't know if this information helps in any way, but I found the behavior of the driver versions in relation to this freezing problem curious. In short, this is what I have noticed happening in my environment:

  • nvidia 555.58.02-16: no freeze. Stable to play without any problems.
  • nvidia 565.57.01-2: freeze. In some games it takes longer to happen, but in Euro Truck Simulator 2 it is "simple" to reproduce because it happens right from the start.
  • nvidia 565.57.01-8: freeze, but only if the GPUs are also being used in the background (watching a stream, Discord call).

No XID or other error message in dmesg log when the game freezes?
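
If it helps, here is a quick sketch for checking a kernel log for Xid messages after a freeze. The function name xid_summary is made up for illustration; it only assumes the "NVRM: Xid (PCI:...): <code>," line format seen in the logs posted in this thread.

```shell
# Hypothetical helper: read kernel log text on stdin and print one
# "<xid code>: <count>" line per distinct Xid code found.
xid_summary() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p' \
        | sort -n | uniq -c | awk '{ print $2 ": " $1 }'
}

# Example: kernel messages from the previous boot (the one that froze)
#   journalctl -k -b -1 | xid_summary
```

No output means no Xid lines were logged for that boot.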

I tested the game on Arch with Hyprland and the 565 drivers, and it runs smoothly on my NVIDIA GPU [GeForce RTX 3060 Mobile / Max-Q].
ProtonDB also says it works.

Well, on ProtonDB other users have also reported stutters...

I can confirm that I also get really crap performance in The Finals on Linux. This was also mentioned before.

I've also started to experience GPU hangs and freezes. Not sure if it's related to kernel version 6.12 or a recent update in PipeWire. Usually there's an audio glitch seconds before.
These hangs/freezes are really bad; most times I need to force a shutdown with the power button.

Dec 03 12:59:03 hostname1 pipewire[1182]: pw.node: (alsa_output.pci-0000_0a_00.4.analog-surround-21-93) graph xrun not-triggered (352 suppressed)
Dec 03 12:59:03 hostname1 pipewire[1182]: pw.node: (alsa_output.pci-0000_0a_00.4.analog-surround-21-93) xrun state:0x7a9e15a95008 pending:1/2 s:4109389497543 a:4109392185351 f:4109392197821 waiting:2687808 process:12470 status:triggered
Dec 03 12:59:03 hostname1 pipewire[1182]: pw.node: (Rocket League-56) xrun state:0x7a9e172ed008 pending:0/1 s:4109392185351 a:4109342500139 f:4109342502899 waiting:18446744073659866404 process:2760 status:triggered
Dec 03 12:59:03 hostname1 kernel: NVRM: Xid (PCI:0000:08:00): 32, pid=17600, name=RocketLeague.ex, Channel ID 00000048 intr 00008000
Dec 03 12:59:03 hostname1 kernel: NVRM: Xid (PCI:0000:08:00): 32, pid=17600, name=RocketLeague.ex, Channel ID 00000048 intr 00008000
Dec 03 12:59:05 hostname1 kernel: NVRM: Xid (PCI:0000:08:00): 56, pid='<unknown>', name=<unknown>, CMDre 00000007 0000041c ffffffff 00000004 00800000
Dec 03 12:59:10 hostname1 kernel: NVRM: Xid (PCI:0000:08:00): 79, pid='<unknown>', name=<unknown>, GPU has fallen off the bus.
Dec 03 12:59:10 hostname1 kernel: NVRM: GPU 0000:08:00.0: GPU has fallen off the bus.
Dec 03 12:59:10 hostname1 kernel: NVRM: kgspRcAndNotifyAllChannels_IMPL: RC all channels for critical error 79.
Dec 03 12:59:10 hostname1 kernel: NVRM: _threadNodeCheckTimeout: API_GPU_ATTACHED_SANITY_CHECK failed!

nvidia-bug-report.log.gz (524.6 KB)

Hello. Is there any ETA for a newer beta, NVIDIA? Thanks in advance.


Graphical bugs in Ubisoft games, e.g. AC Mirage, aren't fixed in this release...