570 release feedback & discussion

Just got a kernel oops. It seems to be related to the NVIDIA driver:

Feb 16 00:53:26 pnb kernel: BUG: kernel NULL pointer dereference, address: 000000000000026c
Feb 16 00:53:26 pnb kernel: #PF: supervisor read access in kernel mode
Feb 16 00:53:26 pnb kernel: #PF: error_code(0x0000) - not-present page
Feb 16 00:53:26 pnb kernel: PGD 0 P4D 0 
Feb 16 00:53:26 pnb kernel: Oops: Oops: 0000 [#1] PREEMPT SMP
Feb 16 00:53:26 pnb kernel: CPU: 9 UID: 0 PID: 987727 Comm: Xorg Tainted: G        W  O       6.13.2-pnb #72
Feb 16 00:53:26 pnb kernel: Tainted: [W]=WARN, [O]=OOT_MODULE
Feb 16 00:53:26 pnb kernel: Hardware name: LENOVO 21D6CTO1WW/21D6CTO1WW, BIOS N3FET38W (1.23 ) 09/27/2023
Feb 16 00:53:26 pnb kernel: RIP: 0010:nvAssignDefaultUsageBounds+0xa8/0x1d0 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel: Code: 43 0c 00 89 43 08 49 8b 84 24 10 0c 00 00 48 8b 80 98 01 00 00 48 85 c0 74 a0 4c 89 e7 e8 10 a5 fb ff ba 04 00 00 00 48 63 ca <8b> 7c c8 08 85 ff 74 60 48 8d 04 c8 0f b7 48 08 66 89 4b 04 0f b7
Feb 16 00:53:26 pnb kernel: RSP: 0018:ffff8881b1c9f800 EFLAGS: 00010206
Feb 16 00:53:26 pnb kernel: RAX: 0000000000000244 RBX: ffff88810a618058 RCX: 0000000000000004
Feb 16 00:53:26 pnb kernel: RDX: 0000000000000004 RSI: ffff88810a618040 RDI: ffffc90000189008
Feb 16 00:53:26 pnb kernel: RBP: ffffc900001896b8 R08: 0000000000000400 R09: 0000000000000400
Feb 16 00:53:26 pnb kernel: R10: 0000000000000000 R11: 0000000000000f00 R12: ffffc90000189008
Feb 16 00:53:26 pnb kernel: R13: ffff88810a618040 R14: ffff88810a618058 R15: ffffc90000189978
Feb 16 00:53:26 pnb kernel: FS:  00007f52a7342b00(0000) GS:ffff88901fc40000(0000) knlGS:0000000000000000
Feb 16 00:53:26 pnb kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 16 00:53:26 pnb kernel: CR2: 000000000000026c CR3: 000000095af24006 CR4: 0000000000772ef0
Feb 16 00:53:26 pnb kernel: PKRU: 55555554
Feb 16 00:53:26 pnb kernel: Call Trace:
Feb 16 00:53:26 pnb kernel:  <TASK>
Feb 16 00:53:26 pnb kernel:  ? __die_body.cold+0x8/0x17
Feb 16 00:53:26 pnb kernel:  ? page_fault_oops+0x64/0x1b0
Feb 16 00:53:26 pnb kernel:  ? exc_page_fault+0x69/0xb0
Feb 16 00:53:26 pnb kernel:  ? asm_exc_page_fault+0x22/0x30
Feb 16 00:53:26 pnb kernel:  ? nvAssignDefaultUsageBounds+0xa8/0x1d0 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? ConstructHwModeTimingsViewPort.constprop.0+0x117/0x160 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? nvConstructHwModeTimingsEvo+0x387/0x7c0 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? nvValidateModeForModeset+0x122/0x170 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? rmapiControlWithSecInfoTls+0xa0/0xf0 [nvidia]
Feb 16 00:53:26 pnb kernel:  ? nvGetHwModeTimings+0x15e/0x220 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? nvHsConfigInitModeset+0x2b8/0xab0 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? nvSetDispModeEvo+0x465/0x4150 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? io_flush_timeouts+0xe/0xd0
Feb 16 00:53:26 pnb kernel:  ? SetModePrepUser+0x33/0xf0 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? Flip+0xf0/0xf0 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? nvkms_alloc+0x78/0xa0 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? Flip+0xf0/0xf0 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? nvKmsIoctl+0xf2/0x240 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? nvkms_unlocked_ioctl+0xc8/0x140 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel:  ? __x64_sys_ioctl+0x75/0xa0
Feb 16 00:53:26 pnb kernel:  ? do_syscall_64+0x37/0xe0
Feb 16 00:53:26 pnb kernel:  ? entry_SYSCALL_64_after_hwframe+0x4b/0x53
Feb 16 00:53:26 pnb kernel:  </TASK>
Feb 16 00:53:26 pnb kernel: Modules linked in: udp_diag tcp_diag inet_diag nvidia_uvm(O) veth snd_usb_audio snd_usbmidi_lib snd_ump usbhid snd_rawmidi snd_seq_device rfcomm cmac algif_hash algif_skcipher af_alg nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge stp llc vmnet(O) vmw_vsock_vmci_transport vsock vmw_vmci vmmon(O) ctr ccm overlay at24 regmap_i2c ee1004 cpufreq_userspace cpufreq_ondemand cpufreq_conservative cpufreq_powersave bnep sunrpc nft_chain_nat xt_nat xt_MASQUERADE nf_nat xt_addrtype xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_CHECKSUM xt_tcpudp nft_compat binfmt_misc nf_tables libcrc32c nls_ascii nls_cp437 vfat fat snd_soc_skl_hda_dsp snd_soc_intel_sof_board_helpers snd_soc_intel_hda_dsp_common snd_sof_probes snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_soc_dmic snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink
Feb 16 00:53:26 pnb kernel:  snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks snd_soc_acpi soundwire_bus snd_soc_sdca x86_pkg_temp_thermal intel_powerclamp snd_soc_avs snd_soc_hda_codec snd_hda_ext_core snd_hda_codec_hdmi snd_soc_core kvm_intel snd_compress nvidia_drm(O) kvm uvcvideo crct10dif_pclmul ghash_clmulni_intel videobuf2_vmalloc uvc sha512_ssse3 videobuf2_memops sha512_generic videobuf2_v4l2 snd_hda_intel sha256_ssse3 nvidia_modeset(O) snd_intel_dspcfg iwlmvm videodev sha1_ssse3 snd_ctl_led snd_intel_sdw_acpi videobuf2_common aesni_intel snd_hda_codec mc nvidia(O) mac80211 gf128mul processor_thermal_device_pci snd_hda_core libarc4 crypto_simd processor_thermal_device drm_ttm_helper ptp snd_hwdep cryptd processor_thermal_wt_hint ttm rapl thinkpad_acpi btusb pps_core snd_pcm processor_thermal_rfim mei_hdcp mei_pxp intel_cstate drm_client_lib nvram snd_timer platform_profile btrtl processor_thermal_rapl intel_rapl_msr ucsi_acpi drm_kms_helper
Feb 16 00:53:26 pnb kernel:  typec_ucsi iwlwifi nvidiafb tiny_power_button intel_rapl_common intel_uncore think_lmi btbcm vgastate processor_thermal_wt_req mei_me firmware_attributes_class roles fb_ddc snd wmi_bmof btmtk processor_thermal_power_floor int3400_thermal i2c_algo_bit processor_thermal_mbox int3403_thermal typec cfg80211 soundcore mei btintel acpi_thermal_rel intel_hid ac intel_vsec int340x_thermal_zone button acpi_pad sparse_keymap acpi_tad joydev evdev mac_hid serio_raw bluetooth rfkill ecdh_generic ecc coretemp parport_pc ppdev lp parport dm_mod fuse efi_pstore configfs nfnetlink efivarfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 hid_multitouch hid_generic i2c_designware_platform i2c_designware_core ccp rtsx_pci_sdmmc i2c_hid_acpi mmc_core i2c_hid intel_lpss_pci xhci_pci drm intel_lpss nvme xhci_hcd i2c_i801 rtsx_pci idma64 video crc32_pclmul crc32c_intel psmouse usbcore thunderbolt i2c_smbus nvme_core mfd_core usb_common fan hid battery wmi pinctrl_alderlake
Feb 16 00:53:26 pnb kernel: CR2: 000000000000026c
Feb 16 00:53:26 pnb kernel: ---[ end trace 0000000000000000 ]---
Feb 16 00:53:26 pnb kernel: RIP: 0010:nvAssignDefaultUsageBounds+0xa8/0x1d0 [nvidia_modeset]
Feb 16 00:53:26 pnb kernel: Code: 43 0c 00 89 43 08 49 8b 84 24 10 0c 00 00 48 8b 80 98 01 00 00 48 85 c0 74 a0 4c 89 e7 e8 10 a5 fb ff ba 04 00 00 00 48 63 ca <8b> 7c c8 08 85 ff 74 60 48 8d 04 c8 0f b7 48 08 66 89 4b 04 0f b7
Feb 16 00:53:26 pnb kernel: RSP: 0018:ffff8881b1c9f800 EFLAGS: 00010206
Feb 16 00:53:26 pnb kernel: RAX: 0000000000000244 RBX: ffff88810a618058 RCX: 0000000000000004
Feb 16 00:53:26 pnb kernel: RDX: 0000000000000004 RSI: ffff88810a618040 RDI: ffffc90000189008
Feb 16 00:53:26 pnb kernel: RBP: ffffc900001896b8 R08: 0000000000000400 R09: 0000000000000400
Feb 16 00:53:26 pnb kernel: R10: 0000000000000000 R11: 0000000000000f00 R12: ffffc90000189008
Feb 16 00:53:26 pnb kernel: R13: ffff88810a618040 R14: ffff88810a618058 R15: ffffc90000189978
Feb 16 00:53:26 pnb kernel: FS:  00007f52a7342b00(0000) GS:ffff88901fc40000(0000) knlGS:0000000000000000
Feb 16 00:53:26 pnb kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 16 00:53:26 pnb kernel: CR2: 000000000000026c CR3: 000000095af24006 CR4: 0000000000772ef0
Feb 16 00:53:26 pnb kernel: PKRU: 55555554
Feb 16 00:53:26 pnb kernel: note: Xorg[987727] exited with irqs disabled
Feb 16 00:53:27 pnb systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.
Feb 16 00:53:27 pnb kernel: usb 1-1: USB disconnect, device number 69
Feb 16 00:54:00 pnb systemd-logind[784]: Lid closed.
Feb 16 00:54:02 pnb systemd-logind[784]: Lid opened.
Feb 16 00:54:07 pnb kernel: NVRM: rm_power_source_change_event: rm_power_source_change_event: Failed to handle Power Source change event, status=0x11
Feb 16 00:54:09 pnb kernel: NVRM: rm_power_source_change_event: rm_power_source_change_event: Failed to handle Power Source change event, status=0x11

nvidia-bug-report.log.gz (381.3 KB)
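For anyone comparing traces, here is a rough sketch for pulling the faulting symbol and module out of an oops like the one above. The function name `oops_rip` is made up for illustration, and it assumes the `RIP: 0010:symbol+0xoff/0xlen [module]` line format seen in this log.

```shell
# Print the faulting symbol and module from kernel oops text on stdin.
# Assumes lines shaped like: "RIP: 0010:symbol+0xoff/0xlen [module]".
# The RIP line appears twice in one oops, so sort -u drops the repeat.
# Possible usage (needs permission to read the journal):
#   journalctl -k -b | oops_rip
oops_rip() {
    sed -nE 's/.*RIP: [0-9a-f]+:([^ ]+) \[([^]]+)\].*/\1 \2/p' | sort -u
}
```

Fed the log above, this should print the `nvAssignDefaultUsageBounds` offset followed by `nvidia_modeset`, which is handy for checking whether other reports hit the same spot.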

It’s not a bug. The 4060 always reports a minimum of 52ish watts, so Nvidia likely made it harder to get power usage info to avoid whining.

For me the 4060 reports nothing at all; nvidia-smi always shows N/A.
The 4060 Ti shows 6 W at idle and more under load, so something is definitely broken. I have three 4060 GPUs and all have the same issue. One is an MSI, so it’s not down to the brand or VBIOS; it’s an NVIDIA issue.
[screenshot]
As you can see, the GPU is under load (59 °C, P0) and power still shows N/A.

What part of “it’s not a bug” do you not understand? Nvidia cheaped out at the hardware level. It does not have the ability to read low power usage via standard power monitoring. The only alternative is to report the 52 watt minimum power draw, but then you’d complain about 52 watts being drawn 24/7.

See? 52 watts at minimum.

Edit: zexzez is just trolling.

OK, but did you look at my nvidia-smi output? I pushed the card to nearly 100 W and it still shows N/A.

Flagging my comment for no reason and willfully ignoring what was said. Please go to Reddit if you want to troll.

I get the same here with 570.86.16, so it’s unusable for me.

Resume from suspend has AFAIK never worked and doesn’t work with Wayland. Btw, fbdev=1 seems to be causing issues with Wayland (X probably too) and seems to be implicitly enabled for the 570 beta driver, so you probably don’t need to add that to the GRUB config. I have set it to 0 and it seems to have solved all my Wayland and Xorg issues, except resume from suspend on Wayland, which has never worked AFAIK. However, I do not use a laptop. Laptops usually have more complicated configurations and experience more issues. In theory I could enable/use hybrid graphics, but I don’t really need that.
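For reference, a minimal sketch of one way to pin the fbdev setting through a modprobe option file rather than the kernel command line (where the equivalent would be `nvidia-drm.fbdev=0`). The function name and the example file path are assumptions for illustration, not something taken from the driver docs. Depending on the distro, the initramfs may need regenerating and a reboot is needed before the change applies.

```shell
# Write a modprobe option line that sets nvidia-drm's fbdev parameter.
# $1: target file (hypothetical example: /etc/modprobe.d/nvidia-fbdev.conf)
# $2: value, 0 or 1
# Writing under /etc/modprobe.d requires root.
write_fbdev_option() {
    case $2 in
        0|1) printf 'options nvidia-drm fbdev=%s\n' "$2" > "$1" ;;
        *)   echo "fbdev must be 0 or 1" >&2; return 1 ;;
    esac
}
```

After a reboot, `cat /sys/module/nvidia_drm/parameters/fbdev` should show which value is actually in effect.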

just FYI: I have fbdev=1 on X and it works without issues in my case (3090 eGPU, debian-13, Intel laptop)

The only issue I had with X was that it would crash (or rather gracefully shut down) when I started a VBox VM after resuming from suspend. Have you tested this scenario? OTOH, the issue I was having on Wayland was that Gnome would freeze when I entered full screen while watching high-res video in FF (later Chrome too). This issue has been reappearing and happening to various people over the years in one form or another. About a week ago an EGL update from the graphics drivers PPA had solved it, or so I thought; then it happened again. However, now that I have set fbdev to 0 I am not able to reproduce it anymore.

Ah no: I never use suspend-to-RAM (only hibernate), which probably saves me a lot of headaches (for the price of about 10-20s each time I hibernate). Also, since I use my 3090 as an eGPU and hibernating usually means moving physically and leaving the eGPU behind, I terminate all processes running on it first…


Bump.

Hi, which software are you using? Thanks in advance.

I’m on Ubuntu 22.04 with kernel 6.5 and was on v550 and CUDA 12.4 with everything working great. I updated to 570 and had multiple issues:

  • Fans started running all the time with elevated temps, even when idling
  • Running models on dual RTX 3090s was way slower
  • Machine would no longer sleep (suspend). Multiple instances of the following in the log:

Feb 17 05:46:27 my_machine kernel: [28471.713876] nvidia 0000:08:00.0: PM: pci_pm_suspend(): nv_pmops_suspend+0x0/0x50 [nvidia] returns -5
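The `-5` in that line is a negative errno, and errno 5 on Linux is `EIO` (I/O error). To see how often this happens and on which device, something like the sketch below might help. The function name is made up, and it assumes log lines shaped like the one quoted above.

```shell
# From kernel log text on stdin, print "device pm_stage callback return_code"
# for every failed PM callback line, e.g.
#   "... nvidia 0000:08:00.0: PM: pci_pm_suspend(): nv_pmops_suspend+0x0/0x50 [nvidia] returns -5"
# Possible usage: journalctl -k | pm_suspend_failures | sort | uniq -c
pm_suspend_failures() {
    sed -nE 's/.* ([0-9a-f:.]+): PM: ([a-z_]+)\(\): ([^ ]+) .*returns (-?[0-9]+)$/\1 \2 \3 \4/p'
}
```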


Resuming from suspend (on Xorg) and then logging in to a Wayland session reintroduces the issue where Gnome will freeze when a video goes full screen. The issue doesn’t manifest if I don’t suspend-resume before playing videos in a Wayland session. IIRC, w/o fbdev=0 videos would freeze even if I didn’t resume from suspend before playing them. Setting nvidia fbdev to 0 has also helped with the issue where the X11 session would shut down when I start a VBox VM after resuming from suspend.

I have the same with 1080.
550 works fine.

I wanted to share a brightness control issue I’ve encountered on my Lenovo LOQ laptop. The laptop has an RTX 4060 GPU, and is running Debian Trixie Alpha 1, KDE 6 and the NVIDIA BETA Driver (Version: 570.86.16).

Using the hardware keys to adjust brightness can lead to system crashes. However, brightness adjustments via the command line work fine with the command:

echo [brightness_level] | sudo tee /sys/class/backlight/nvidia_0/brightness
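As a small convenience around that command, here is a sketch of a helper that clamps the requested level to the range the backlight device advertises before writing it. It relies on the standard sysfs `brightness` and `max_brightness` files; the function name is hypothetical, and `nvidia_0` is simply the device name from the command above.

```shell
# Clamp a requested level to [0, max_brightness] and write it.
# $1: backlight directory (e.g. /sys/class/backlight/nvidia_0)
# $2: requested level (integer)
# Writing to sysfs requires root, so call this from a root shell or script.
set_brightness() {
    dir=$1
    level=$2
    max=$(cat "$dir/max_brightness") || return 1
    [ "$level" -lt 0 ] && level=0
    [ "$level" -gt "$max" ] && level=$max
    printf '%s\n' "$level" > "$dir/brightness"
}
```

For example, `set_brightness /sys/class/backlight/nvidia_0 50` would write 50, or the device maximum if that is lower.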

For anyone interested, I recommend checking out my detailed HTML page for more information:
http://mentalnet.xyz/debtrixloq/brightness-issue.html

I’m using this setup primarily as an AI rig, so I’m not overly concerned about the brightness issue, but I thought it would be helpful for others to know! (P.S. Hope this is the right spot; I’m not trying to navigate a labyrinth of categories here! 😂)

The only issue I am having so far on Fedora 41 and KDE 6.3 is the intermittent freeze of one of the two monitors as described in this thread: Screen freezes on Wayland with *ERROR* [nvidia-drm] [GPU ID 0x00002d00] Flip event timeout on head 1

Switching to a terminal TTY and back fixes it 99% of the time. 1% of the time kwin will restart with a lot of running applications crashing.

I installed this driver on 2 of my notebooks; wattage and temps also increased.

I also noticed the wattage the first time I ran nvidia-smi. My iGPU is running most of the time, but when I run nvidia-smi the dGPU is enabled and the reported wattage is insane. I suspect a bug, because my GPU did not get fried lol.

Every time the iGPU takes over and you re-run the nvidia-smi command, the wattage will be like in the pic above. When I run the command again (dGPU is active), the wattage is between 11 W and 40 W.
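To compare readings without screenshots, nvidia-smi’s query mode can print just the power field as CSV. Below is a rough sketch of a filter for that output. The function name is made up, and it assumes each line is `name, value`, where the value is either a number or a bracketed marker such as `[N/A]`.

```shell
# Summarise per-GPU power readings from CSV on stdin ("name, value" per line).
# Possible usage:
#   nvidia-smi --query-gpu=name,power.draw --format=csv,noheader,nounits | power_report
power_report() {
    awk -F', *' '{
        if ($2 ~ /^[0-9.]+$/) printf "%s: %s W\n", $1, $2
        else printf "%s: power reading unavailable (%s)\n", $1, $2
    }'
}
```

Running it a few times in a row after the dGPU wakes up would show whether only the first sample is off.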


This has nothing to do with anything I said.