565 release feedback & discussion

That doesn’t work, or it fails to completely clean up the leftovers. In fact, I am crazy angry at nvidia for treating the system like a garbage dump.
They must fully clean up the system: not just the driver, but dkms, /lib/modules, and everything else their driver left behind.

Just encountered error while trying to suspend:

nvidia-modeset: ERROR: GPU:0: Failed to query display engine channel state: 0x0000c67d:0:0:0x00000011
(a lot of such messages)
nvidia-modeset: ERROR: GPU:0: Timed out waiting for core channel idle.

then this repeats multiple times:

NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x00000011 for fn 10!
[164168.368257] NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d00001; hObject=0x00010067; paramsStatus=0x00000000; status=0x00000011
[164168.368259] NVRM: nvAssertFailedNoLog: Assertion failed: status == NV_OK @ rs_client.c:843
[164168.368266] NVRM: nvAssertFailedNoLog: Assertion failed: status == NV_OK @ rs_server.c:258

Finally this:

nvidia-modeset: ERROR: GPU:0: Failed to tear down Disp
[164168.373663] ------------[ cut here ]------------
[164168.373664] WARNING: CPU: 12 PID: 188260 at /tmp/akmodsbuild.ZGp8k8vQ/BUILD/nvidia-open-kmod-565.57.01-build/nvidia-open-kmod-565.57.01/_kmod_build_6.11.8-300.fc41.x86_64/kernel-open/nvidia/nv.c:4384 nv_set_system_power_state+0x2f2/0x480 [nvidia]
[164168.373796] Modules linked in: uas usb_storage nvidia_uvm(O) rfcomm snd_seq_dummy snd_hrtimer tun nvidia_drm(O) nvidia_modeset(O) nvidia(O) uhid nf_nat_tftp nf_nat_ftp nft_masq nf_conntrack_sane nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_tftp nf_conntrack_ftp nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security nf_tables ip_set vboxnetadp(O) vboxnetflt(O) ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables vboxdrv(O) qrtr bnep snd_ctl_led snd_soc_skl_hda_dsp snd_soc_hdac_hdmi snd_sof_probes snd_soc_intel_hda_dsp_common binfmt_misc vfat fat snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_soc_dmic snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel soundwire_cadence
[164168.373829]  snd_sof_intel_hda_common snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils soundwire_generic_allocation soundwire_bus snd_soc_avs snd_soc_hda_codec iwlmvm snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_sst_ipc snd_soc_sst_dsp intel_uncore_frequency snd_soc_acpi_intel_match intel_uncore_frequency_common intel_tcc_cooling snd_soc_acpi x86_pkg_temp_thermal intel_powerclamp mac80211 snd_soc_core snd_hda_codec_hdmi snd_compress ac97_bus snd_pcm_dmaengine coretemp snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi kvm_intel uvcvideo snd_hda_codec uvc dell_pc videobuf2_vmalloc mei_hdcp mei_pxp libarc4 ee1004 platform_profile snd_hda_core videobuf2_memops snd_hwdep kvm intel_rapl_msr videobuf2_v4l2 processor_thermal_device_pci_legacy processor_thermal_device snd_seq processor_thermal_wt_hint btusb videobuf2_common snd_seq_device spi_nor dell_laptop processor_thermal_rfim btrtl iwlwifi btintel videodev iTCO_wdt mtd processor_thermal_rapl snd_pcm r8169 btbcm
[164168.373868]  intel_pmc_bxt iTCO_vendor_support mei_me intel_rapl_common btmtk dell_wmi rapl dell_smbios bluetooth mc snd_timer dcdbas intel_cstate cfg80211 intel_uncore alienware_wmi dell_wmi_sysman dell_wmi_descriptor intel_wmi_thunderbolt pcspkr firmware_attributes_class wmi_bmof dell_smm_hwmon thunderbolt snd processor_thermal_wt_req i2c_i801 processor_thermal_power_floor spi_intel_pci mei realtek processor_thermal_mbox drm_ttm_helper soundcore rfkill i2c_smbus spi_intel intel_soc_dts_iosf intel_pch_thermal idma64 int3403_thermal int340x_thermal_zone joydev intel_pmc_core intel_vsec int3400_thermal pmt_telemetry intel_hid pmt_class acpi_thermal_rel acpi_pad acpi_tad sparse_keymap loop dm_multipath nfnetlink i915 i2c_algo_bit drm_buddy ttm nvme crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni drm_display_helper polyval_generic nvme_core ghash_clmulni_intel hid_multitouch sha512_ssse3 ucsi_acpi sha256_ssse3 sha1_ssse3 typec_ucsi cec nvme_auth typec video i2c_hid_acpi i2c_hid wmi pinctrl_cannonlake serio_raw
[164168.373904]  scsi_dh_rdac scsi_dh_emc scsi_dh_alua fuse i2c_dev
[164168.373908] CPU: 12 UID: 0 PID: 188260 Comm: nvidia-sleep.sh Tainted: G        W  O       6.11.8-300.fc41.x86_64 #1
[164168.373911] Tainted: [W]=WARN, [O]=OOT_MODULE
[164168.373911] Hardware name: Dell Inc. Dell G15 5510/0H0F1D, BIOS 1.27.0 08/06/2024
[164168.373913] RIP: 0010:nv_set_system_power_state+0x2f2/0x480 [nvidia]
[164168.373992] Code: ad c8 06 00 00 48 85 ed 75 d3 e9 d9 fd ff ff 0f 0b e9 13 fe ff ff 48 8b 3d e3 a0 3d 00 4c 89 f6 e8 a3 81 50 e8 e9 12 ff ff ff <0f> 0b 4c 89 ef e8 d4 78 2b e9 4d 85 f6 74 0d e8 2a dd 11 00 84 c0
[164168.373994] RSP: 0018:ffffafd98187ba30 EFLAGS: 00010206
[164168.373995] RAX: 0000000000000011 RBX: 0000000000000001 RCX: 0000000000000000
[164168.373996] RDX: 0000000000000087 RSI: 0000000000000282 RDI: ffff93c052c89688
[164168.373997] RBP: 0000000000000011 R08: 0000000000000000 R09: ffffafd98187b80c
[164168.373998] R10: ffffffffc27182b0 R11: ffffffffc282de7e R12: ffff93c04da88000
[164168.373999] R13: ffff93c04da886a0 R14: 0000000000000000 R15: 0000000000000000
[164168.374000] FS:  00007f58df501740(0000) GS:ffff93c3bf600000(0000) knlGS:0000000000000000
[164168.374002] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[164168.374003] CR2: 00007f58df6ed650 CR3: 0000000103fa8003 CR4: 00000000007706f0
[164168.374004] PKRU: 55555554
[164168.374005] Call Trace:
[164168.374007]  <TASK>
[164168.374007]  ? nv_set_system_power_state+0x2f2/0x480 [nvidia]
[164168.374085]  ? __warn.cold+0x8e/0xe8
[164168.374089]  ? nv_set_system_power_state+0x2f2/0x480 [nvidia]
[164168.374168]  ? report_bug+0xff/0x140
[164168.374170]  ? handle_bug+0x58/0x90
[164168.374172]  ? exc_invalid_op+0x17/0x70
[164168.374173]  ? asm_exc_invalid_op+0x1a/0x20
[164168.374175]  ? nv_set_system_power_state+0x2f2/0x480 [nvidia]
[164168.374253]  nv_procfs_write_suspend+0x105/0x1b0 [nvidia]
[164168.374335]  proc_reg_write+0x5a/0xa0
[164168.374338]  vfs_write+0xfb/0x460
[164168.374341]  ? syscall_exit_to_user_mode+0x10/0x220
[164168.374344]  ksys_write+0x6d/0xf0
[164168.374346]  do_syscall_64+0x82/0x160
[164168.374348]  ? _copy_to_user+0x36/0x50
[164168.374350]  ? cp_new_stat+0x131/0x170
[164168.374354]  ? __do_sys_newfstat+0x68/0x70
[164168.374355]  ? syscall_exit_to_user_mode+0x10/0x220
[164168.374358]  ? do_syscall_64+0x8e/0x160
[164168.374360]  ? filp_close+0x1d/0x30
[164168.374362]  ? do_dup2+0xad/0x130
[164168.374364]  ? syscall_exit_to_user_mode+0x10/0x220
[164168.374366]  ? do_syscall_64+0x8e/0x160
[164168.374368]  ? __count_memcg_events+0x75/0x130
[164168.374370]  ? count_memcg_events.constprop.0+0x1a/0x30
[164168.374372]  ? handle_mm_fault+0x21b/0x330
[164168.374374]  ? do_user_addr_fault+0x55a/0x7b0
[164168.374377]  ? clear_bhb_loop+0x25/0x80
[164168.374378]  ? clear_bhb_loop+0x25/0x80
[164168.374380]  ? clear_bhb_loop+0x25/0x80
[164168.374381]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[164168.374384] RIP: 0033:0x7f58df5eb904
[164168.374398] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 65 87 10 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
[164168.374399] RSP: 002b:00007ffdbc0fcba8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[164168.374401] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f58df5eb904
[164168.374402] RDX: 0000000000000008 RSI: 0000563f69690180 RDI: 0000000000000001
[164168.374402] RBP: 00007ffdbc0fcbd0 R08: 0000000000000410 R09: 0000000000000001
[164168.374403] R10: 0000000000000004 R11: 0000000000000202 R12: 0000000000000008
[164168.374404] R13: 0000563f69690180 R14: 00007f58df6ed5c0 R15: 00007f58df6eae80
[164168.374406]  </TASK>
[164168.374406] ---[ end trace 0000000000000000 ]---

After that, the GPU is unresponsive.

Dell G15 5510 laptop with RTX 3060

nvidia-bug-report collected after a reboot; I could not get one without rebooting because the script just hangs:
nvidia-bug-report.log.gz (1.0 MB)
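
For anyone trying to compare their own suspend failures with the log above, here is a minimal sketch that tallies the status codes on the NVRM lines. It is only an assumption that the codes always appear as `status 0x...` or `status=0x...` the way they do in this report; the function name `tally_nvrm_status` is made up for illustration, and the sketch does not try to interpret what a code means.

```python
import re
from collections import Counter

# Matches "status 0x00000011" and "status=0x00000011" as seen in the report above.
# The match is case-sensitive, so "paramsStatus=..." is deliberately not counted.
STATUS_RE = re.compile(r"\bstatus[= ](0x[0-9a-fA-F]{8})")


def tally_nvrm_status(log_text):
    """Count status codes found on lines that contain 'NVRM:'."""
    counts = Counter()
    for line in log_text.splitlines():
        if "NVRM:" not in line:
            continue  # skip nvidia-modeset and unrelated kernel lines
        for code in STATUS_RE.findall(line):
            counts[code.lower()] += 1
    return counts


# Two lines copied from the report above.
SAMPLE = (
    "NVRM: _issueRpcAndWait: rpcSendMessage failed with status 0x00000011 for fn 10!\n"
    "[164168.368257] NVRM: rpcRmApiFree_GSP: GspRmFree failed: hClient=0xc1d00001; "
    "hObject=0x00010067; paramsStatus=0x00000000; status=0x00000011\n"
)
print(dict(tally_nvrm_status(SAMPLE)))
```

Feeding it saved `dmesg` or `journalctl -k` output instead of `SAMPLE` should show whether every failure carries the same status, which is useful detail to put in a report.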

565 is still in beta, so if I had to guess, we could see the 570 beta in January 2025. But I’m not a developer; I’m just looking at the usual time span between driver releases for Linux.

Can someone from Nvidia file a bug on this: stuttering and low FPS when scrolling in browsers on Wayland when GSP firmware is enabled?

Pretty annoying. Disabling GSP is not an option on nvidia-open.

Hmm, I see a 550 driver update was released today with kernel fixes. Please roll out a point release for the latest beta too. I need kernel 6.12 compatibility. Thank you in advance.

Unfortunately, GSP was first developed for compute workloads, and in its current state GSP is not ready for graphics applications. There is too much latency between the GSP chip and the host system, and the Nvidia DRM driver doesn’t handle that well.

You mean it cannot be fixed or what? Windows doesn’t have this problem and AFAIK it also uses GSP.

This also affects the closed driver, so those of us who don’t want to use the vomit-comet that is GSP will have to wait for a bump in the beta driver before jumping onto 6.12.

565 Beta version 2 coming with 6.12 support and Multi Monitor VRR Clueless

Windows doesn’t have this problem and AFAIK it also uses GSP.

The Windows driver is not using GSP, you can check by typing nvidia-smi -q.
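
If you want to script that check, here is a rough sketch. It assumes the `nvidia-smi -q` output has a field labelled "GSP Firmware Version" and that a value of N/A means GSP is not in use; both are assumptions about the output format, which may differ between driver versions. The function names are made up for illustration.

```python
import shutil
import subprocess


def gsp_firmware_versions(query_text):
    """Return the value of every 'GSP Firmware Version' field in `nvidia-smi -q` text."""
    versions = []
    for line in query_text.splitlines():
        key, sep, value = line.partition(":")
        # Assumed field label; adjust if your driver prints it differently.
        if sep and key.strip() == "GSP Firmware Version":
            versions.append(value.strip())
    return versions


def query_gsp():
    """Run `nvidia-smi -q` if it is installed; return None when it is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "-q"], capture_output=True, text=True, check=False
    )
    return gsp_firmware_versions(result.stdout)
```

Calling `query_gsp()` gives one entry per GPU that reports the field, so a multi-GPU box can be checked in one go.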

You mean it cannot be fixed or what?

It can be fixed but for now GSP is not ready for desktop use and this is why GSP is disabled in the Windows driver.

When?

Nobody knows. I was joking, it will not be in 565, expect it in 570.

It is actually bad if no new beta is coming in the next few days, as 565 does not work with 6.12, and 550 is useless, laggy, buggy, etc. on Wayland.

A lot of people are complaining (and I do understand them), but all I see here is:

  • A thread dedicated to Linux testers to help the developers keep on improving the 565 driver (and 570 in the future).

  • Developers answering whenever they can to help along. As a developer myself, I would take their silence to mean they are working on it in a dedicated fashion rather than spending time responding to people who do not contribute what this thread is literally for (feedback).

  • Some contributors have done a FANTASTIC job, like @faz, who has basically single-handedly contributed more to the proper testing of this driver than everybody else on the thread.

  • This is the same thing that happened with 535, 550, and 560, and it will continue to be done in a way where the testers can be part of the development process.

  • The last feedback from an Nvidia employee was literally this same month. It would be totally different if it had been in January, but come on, it was just 2 weeks ago. It is not easy looking for bugs reported here and trying to ensure they get fixed before the release of the driver. They have even mentioned which things will be fixed in 570 because there is not enough time to do them in 565. Being honest in this development process is critical for me, and they have done so with full transparency.

It would be MUCH different if Nvidia did not ask for support and help from the community: no feedback option, no suggestions, etc. Even worse would be not allowing users to provide any type of feedback.

Just my $29.99 of cents about this, along with thanking Nvidia for at least changing a lot in the past years to provide better engagement and support for the Linux community as a whole.

Thanks, I checked, you are right! Interesting, why doesn’t nvidia-open allow disabling GSP then?

565 has problems running Cyberpunk with NVAPI enabled.

I had turned on DLSS and Frame Gen while using the 555 drivers, and then after upgrading to 565, I tried to launch the game, and it sat and spun forever without ever showing a window. I checked the Proton log and saw it stall out here:

29533.801:0128:012c:trace:loaddll:build_module Loaded L"C:\\ProgramData\\NVIDIA\\NGX\\models\\dlss\\versions\\131844\\files\\160_B9DB490.bin" at 00006FFFEDA90000: native
29533.803:0128:012c:trace:loaddll:free_modref Unloaded module L"C:\\ProgramData\\NVIDIA\\NGX\\models\\dlss\\versions\\131844\\files\\160_B9DB490.bin" : native
29533.804:0128:012c:fixme:cryptasn:CryptDecodeObjectEx Unsupported decoder for lpszStructType 1.3.6.1.4.1.311.2.1.4
29538.802:0128:0260:err:sync:RtlpWaitForCriticalSection section 00006FFFFFFB1380 "../src-wine/dlls/ntdll/loader.c: loader_section" wait timed out in thread 0260, blocked by 026c, retrying (60 sec)
29538.802:0128:026c:err:sync:RtlpWaitForCriticalSection section 00006FFFF1B24FE0 "?" wait timed out in thread 026c, blocked by 0260, retrying (60 sec)
29538.802:0128:0264:err:sync:RtlpWaitForCriticalSection section 00006FFFF1B24FE0 "?" wait timed out in thread 0264, blocked by 0260, retrying (60 sec)
29538.802:0128:0268:err:sync:RtlpWaitForCriticalSection section 00006FFFF1B24FE0 "?" wait timed out in thread 0268, blocked by 0260, retrying (60 sec)

which seems to point very much to some bad DLSS interactions.

I then restarted the game with PROTON_DISABLE_NVAPI=1 and it started fine. I went into the settings and changed the scaling and frame gen settings to None, and then tried running again without NVAPI disabled, and it stalled out in exactly the same way.
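
The "wait timed out in thread X, blocked by Y" lines in the Proton log above look like two threads waiting on each other. Here is a small sketch for pulling that out of a longer log. It assumes the message format is exactly the one shown in this post, it only looks for direct two-thread cycles (not longer chains), and the helper names are made up for illustration.

```python
import re

# Format taken from the RtlpWaitForCriticalSection lines quoted in this post.
WAIT_RE = re.compile(r"wait timed out in thread ([0-9a-f]+), blocked by ([0-9a-f]+)")


def blocked_by(log_text):
    """Map each waiting thread id to the set of thread ids it reports being blocked by."""
    waits = {}
    for waiter, owner in WAIT_RE.findall(log_text):
        waits.setdefault(waiter, set()).add(owner)
    return waits


def mutual_blocks(waits):
    """Return sorted (a, b) pairs where a waits on b and b waits on a."""
    pairs = set()
    for waiter, owners in waits.items():
        for owner in owners:
            if owner != waiter and waiter in waits.get(owner, set()):
                pairs.add(tuple(sorted((waiter, owner))))
    return sorted(pairs)
```

Run over the four timeout lines quoted above, `mutual_blocks(blocked_by(log))` reports threads 0260 and 026c as the pair holding each other up, with 0264 and 0268 queued behind 0260. That is consistent with a loader-lock deadlock rather than a slow load, but it does not say which component is at fault.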

Because NVIDIA is using GSP to hide proprietary code.

But on Linux it is being pushed, even though it’s causing more problems than it solves. Nice…

I think NVIDIA is more focused on AI workloads than desktop/gaming. There are also annoying issues affecting the Windows driver, such as black screens/freezes with DSC.

Yeah, that’s unfortunately true. Their desktop GPU income is like a single-digit percentage. 90-something % is AI ;(
