580 release feedback & discussion

After some uptime, I’m seeing these errors:

[345651.753511] hid-generic 0003:043E:9A39.0020: hiddev3,hidraw8: USB HID v1.11 Device [LG Electronics Inc. LG Monitor Controls] on usb-0000:00:14.0-4.3.3/input0
[345909.493352] NVRM: nvCheckOkFailedNoLog: Check failed: Out of memory [NV_ERR_NO_MEMORY] (0x00000051) returned from _memdescAllocInternal(pMemDesc) @ mem_desc.c:1359
[345909.493355] NVRM: _kgmmuClientShadowFaultBufferPagesAllocate: Allocation failed with big page size, retrying with default page size
[345909.514176] NVRM: nvCheckOkFailedNoLog: Check failed: Out of memory [NV_ERR_NO_MEMORY] (0x00000051) returned from _memdescAllocInternal(pMemDesc) @ mem_desc.c:1359
[345909.514180] NVRM: nvCheckOkFailedNoLog: Check failed: Out of memory [NV_ERR_NO_MEMORY] (0x00000051) returned from rmStatus @ system_mem.c:354
[345909.514189] NVRM: nvAssertOkFailedNoLog: Assertion failed: Out of memory [NV_ERR_NO_MEMORY] (0x00000051) returned from pRmApi->Alloc(pRmApi, device->session->handle, isSystemMemory ? device->handle : device->subhandle, &physHandle, isSystemMemory ? NV01_MEMORY_SYSTEM : NV01_MEMORY_LOCAL_USER, &memAllocParams, sizeof(memAllocParams)) @ nv_gpu_ops.c:4922
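For anyone wanting to check whether they hit the same thing, here is a minimal sketch (not an official tool) that scans dmesg-style lines for NVRM messages tagged NV_ERR_NO_MEMORY and counts them per failing call site. The function name `summarize_nvrm_oom` and the regular expression are my own assumptions based only on the line format shown in the log above.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log above:
# [<seconds since boot>] NVRM: ... [NV_ERR_NO_MEMORY] ... @ <file>:<line>
NVRM_OOM = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM:.*\[NV_ERR_NO_MEMORY\].*@\s+(?P<site>\S+:\d+)\s*$"
)


def summarize_nvrm_oom(lines):
    """Return (earliest timestamp, Counter of call sites) for NVRM OOM lines.

    The timestamp is None when no matching line is found.
    """
    first_ts = None
    sites = Counter()
    for line in lines:
        match = NVRM_OOM.match(line.rstrip("\n"))
        if match is None:
            continue  # not an NVRM out-of-memory line
        ts = float(match.group("ts"))
        if first_ts is None or ts < first_ts:
            first_ts = ts
        sites[match.group("site")] += 1
    return first_ts, sites
```

Feeding it the output of `dmesg` (for example via `subprocess.run(["dmesg"], capture_output=True, text=True).stdout.splitlines()`) gives the uptime in seconds at which the first failure appeared. In the log above that is roughly 345909 s, about four days after boot.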

Running raytraced games beforehand usually provokes this behavior much earlier.

I suspect that this also causes crashes in Chrome and plasma-shell.

nvidia-bug-report.log.gz (763,9 KB)

Did you get a chance to check on the latest 580 driver?

Driver Details | NVIDIA new driver:

  • Added a new environment variable, CUDA_DISABLE_PERF_BOOST, to allow for disabling the default behavior of boosting the GPU to a higher power state when running CUDA applications. Setting this environment variable to ‘1’ will disable the boost.

  • Fixed an issue that caused the vfio-pci module to soft lockup after powering off a VM with passed-through NVIDIA GPUs.

  • Fixed a recent regression which prevented HDMI FRL from working after hot unplugging and replugging a display.

  • Fixed a bug that caused Rage2 to crash when loading the game menu:
    https://forums.developer.nvidia.com/t/rage-2-crashes-when-entering-the-map-seems-nvidia-specific-problem/169063

  • Fixed a bug that caused Metro Exodus EE to crash:
    https://forums.developer.nvidia.com/t/580-release-feedback-discussion/341205/53

  • Fixed a bug that allowed VRR to be enabled on some modes where it isn’t actually possible, leading to a black screen.

  • Fixed a bug that could cause some HDMI displays to remain blank after unplugging and re-plugging the display.

  • Fixed an issue that would prevent large resolution or high refresh rate modes like 7680x2160p@240hz from being available when using HDMI FRL or DisplayPort.

I packaged the new 580.105.08 Production Branch driver for Arch this morning.

Put it through a variety of graphical tests and benchmarks.

No regressions or new issues noted.

No bump to the Vulkan API, it’s still 1.4.312.

I’ll give it a full week of daily usage under both hyprland and sway before updating my issue-tracking thread.

  • kernel 6.17.7
  • hyprland @8e9add2 (-DNO_XWAYLAND:STRING=true)
  • sway @055be4e + !8715 (Native Wayland)
  • wlroots @604fcdb + !5071 (Vulkan Backend) (-Dxwayland=disabled)
  • egl-wayland2 1.01 @e16cb0f
  • vulkan-icd-loader 1.4.328.1
  • nVidia 580.105.08

Oh my. I am looking forward to testing it :)

  • Added a new environment variable, CUDA_DISABLE_PERF_BOOST, to allow for disabling the default behavior of boosting the GPU to a higher power state when running CUDA applications.

What is the use case here? Energy consumption?

Likely yes. I’d imagine something like a video player that uses CUDA to display output in real time (respecting the video’s timings, not running as fast as possible) and would otherwise keep the GPU at high power for no reason.
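To make that concrete, here is a rough sketch of launching a program with the new variable set. The only thing taken from the release notes is that setting CUDA_DISABLE_PERF_BOOST to 1 disables the boost. The helper names are made up for illustration.

```python
import os
import subprocess


def env_without_cuda_boost(base_env=None):
    """Return a copy of the environment with CUDA_DISABLE_PERF_BOOST=1 set.

    The input mapping is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_DISABLE_PERF_BOOST"] = "1"  # value documented in the release notes
    return env


def run_without_cuda_boost(argv):
    """Run argv (a list of strings) with the boost disabled for that process only."""
    return subprocess.run(argv, env=env_without_cuda_boost(), check=False)
```

Something like `run_without_cuda_boost(["some-player", "video.mkv"])` (a hypothetical command) keeps the setting scoped to that one application instead of exporting it session-wide.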

Did you create a PKGBUILD based on the current stable one in the extra repo?

You can build with the latest PKGBUILD by just bumping the pkgver.
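As a rough illustration of what that bump amounts to, here is a small sketch that rewrites the `pkgver=` line in a PKGBUILD's text. The helper `bump_pkgver` is hypothetical, and it assumes the PKGBUILD has a single plain `pkgver=<version>` assignment. The checksums still have to be refreshed separately before building.

```python
import re

PKGVER_LINE = re.compile(r"^pkgver=.*$", flags=re.MULTILINE)


def bump_pkgver(pkgbuild_text, new_version):
    """Return pkgbuild_text with its first pkgver= line set to new_version."""
    # A callable replacement avoids any backslash interpretation in new_version.
    new_text, count = PKGVER_LINE.subn(
        lambda _match: f"pkgver={new_version}", pkgbuild_text, count=1
    )
    if count == 0:
        raise ValueError("no pkgver= line found")
    return new_text
```

For example, `bump_pkgver(open("PKGBUILD").read(), "580.105.08")` returns the updated text and leaves the file itself untouched.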

I prefer to carry my own slimmer PKGBUILD that builds only the proprietary driver and does not install several of the various unneeded X11-related bits as I build my environments without Xwayland support.

Oh, and fewer dependencies. I haven’t had egl-wayland installed for several months now, running solely on egl-wayland2 instead. Eventually that will be released and incorporated into the official arch packaging as a dep I’m guessing.

Thanks,

Super, thank you. I use nvidia-open-dkms, it seems I need to rebuild nvidia-utils and not nvidia. But easy peasy.

Did you update nvidia-settings though? I think the git version is still on 580.95.05, and this is the source of the official PKGBUILD (from https://github.com/NVIDIA/nvidia-settings/archive/)

Yup, that’s the one I modify. It builds both open and proprietary for dkms by default.

No, I’ve not had that installed in many releases. It serves no purpose for my use.

Thanks. Regarding nvidia-settings, the official package from extra is built from the github release, and inexplicably not from the .run like nvidia-utils. Actually the independent package nvidia-utils-beta in the AUR builds everything from the .run archive. This is strange.

That said, I was able to build the driver, nvidia-utils, opencl and lib32 libraries smoothly, as you mentioned, just by bumping the version (pkgver). Neat!

Keep in mind going forward that new patches may be required, and current patches may be dropped or need to be rebased.

And, being kernel modules, rebuilds are required for gcc, glibc, etc bumps.

In general, unless you’ve specific reasons (e.g. my modified dependencies)… in this case I’m guessing you’re simply wanting to run the new driver before it hits extra-testing… prolly best to stick with Arch builds so you don’t get stung by any surprises.

Hit up the relevant Arch forum if you hit any issues.

The .run does not have nvidia-settings sources; you could install the prebuilt binary if you really wanted to, though.

Note that this is (typically) updated before the github tags, and that’s what I use to build it on Gentoo:

Edit: albeit I was only able to do an x86-64 bump in Gentoo, given that the aarch64 .run is still missing for some reason right now.

Regarding the new installation attempt from my previous post: I reinstalled the Nvidia driver about 20 minutes ago via the additional drivers option.
This time, no terminal or console was accessible for logging in. I had to force a shutdown of the PC using the power button.
The Nvidia bug report could neither be saved nor created. It was presumably deleted during the driver uninstallation.

The CUDA_DISABLE_PERF_BOOST environment variable seems to cut power usage in half when using VA-API in Firefox to decode a 4K 60 fps YouTube video. Power draw on my 4070 Ti was hovering around 47-52 watts; now it is using 24-26 watts.
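A quick sanity check of the "in half" figure, using only the wattage ranges quoted above (the function name is just for illustration):

```python
def reduction_percent(before_watts, after_watts):
    """Percentage drop from before_watts to after_watts."""
    return 100.0 * (before_watts - after_watts) / before_watts


before = (47.0, 52.0)  # reported range with the boost active, in watts
after = (24.0, 26.0)   # reported range with CUDA_DISABLE_PERF_BOOST=1, in watts

smallest = reduction_percent(before[0], after[1])  # 47 W -> 26 W
largest = reduction_percent(before[1], after[0])   # 52 W -> 24 W
midpoint = reduction_percent(sum(before) / 2, sum(after) / 2)  # 49.5 W -> 25 W

print(f"{smallest:.1f}% to {largest:.1f}% (midpoints: {midpoint:.1f}%)")
# prints "44.7% to 53.8% (midpoints: 49.5%)"
```

So the reported numbers work out to a drop of roughly 45% to 54%, which matches "about half".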

Very low performance when working with a 4K (3840x2160) display. It feels like the frame rate is dropping, which causes laggy animations. If I lower the resolution to 2K or 1080p, everything becomes smooth.
nvidia-smi -i 0 --lock-gpu-clocks=900,2130 seems to improve the situation a little. Also, under any load, for example watching a video in Firefox through nvidia-vaapi-driver, the interface seems to lag less than usual.
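If anyone wants to try the same workaround and undo it afterwards, here is a small sketch that builds the two nvidia-smi invocations. The lock command is the one quoted above, and `--reset-gpu-clocks` is the matching nvidia-smi option to release the lock. The helper names are my own, and the commands will most likely need root.

```python
def lock_clocks_cmd(gpu_index, min_mhz, max_mhz):
    """Build the nvidia-smi argv that pins GPU clocks to [min_mhz, max_mhz]."""
    if min_mhz > max_mhz:
        raise ValueError("min_mhz must not exceed max_mhz")
    return [
        "nvidia-smi",
        "-i", str(gpu_index),
        f"--lock-gpu-clocks={min_mhz},{max_mhz}",
    ]


def reset_clocks_cmd(gpu_index):
    """Build the nvidia-smi argv that returns GPU clocks to default behavior."""
    return ["nvidia-smi", "-i", str(gpu_index), "--reset-gpu-clocks"]
```

Either list can be passed to `subprocess.run(...)`. Building the argv separately makes it easy to print and double-check the command before running it with elevated privileges.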

Nvidia RTX 3090, nvidia 580.105.08-1, mesa 25.2.6-1.1 (also mesa-git 26.0), linux 6.18.0-rc4. GNOME on Wayland, even with a custom-compiled branch, still lags. Same problem on KDE 6.5 Wayland.

HDMI won’t detect the native resolution (2560x1440) of my monitor after upgrading from 580.95.05 to 580.105.08. I tried both the proprietary and open kernel modules, and it happens on both Wayland and X11. DisplayPort works fine though.

RTX3060, Debian forky, Linux 6.16.12, Gnome 48.5

I just got a black screen while resizing a window with VRR set to “always”, so it appears that black screens are still not fixed.
CachyOS, Plasma 6.5.1
NVIDIA-SMI 580.105.08 Driver Version: 580.105.08 CUDA Version: 13.0
RTX4090