555 release feedback & discussion

EDIT: Issue resolved by installing an older driver first (530), then re-installing 555.42.02. I haven’t run apt-get autoremove because I’m afraid it will break it again :-P

Getting this error when trying to launch Cyberpunk 2077.

System:
Operating System: KDE neon 6.0
KDE Plasma Version: 6.0.4
KDE Frameworks Version: 6.2.0
Qt Version: 6.7.0
Kernel Version: 6.5.0-35-generic (64-bit)
Graphics Platform: Wayland
Processors: 16 × AMD Ryzen 7 5700X 8-Core Processor
Memory: 62.7 GiB of RAM
Graphics Processor: NVIDIA GeForce RTX 4060 Ti/PCIe/SSE2
Manufacturer: Micro-Star International Co., Ltd.
Product Name: MS-7D53
System Version: 1.0

It looks like this really is a problem. The Xwayland process now uses 1.2 GiB of video memory, even though Conky is the only thing I have running on Xwayland. Is there a way to clean up the video memory independently?

I see this issue as well on an RTX 3080. Basically, mixed refresh rates make the desktop feel sluggish on occasion, and sometimes it feels like it's hitched on a frame.

Additional testing so far:

Now running a 3080. I have a Samsung QN90B which I run as a monitor set to 3840x1600, a mode provided through its ‘game mode’.

On Nobara 39, KDE 6.0.5-1

I cannot get colour space to set properly for HDR when using DP, but it is ok with HDMI 2.1. This could be related to the converter cable though. On HDMI HDR sets correctly.

On HDMI 2.1 I cannot run 3840x1600 at 120 Hz. I can do it on DP with a converter to HDMI 2.1, with resolution and refresh rate set correctly. In that setup adaptive sync can be set but won't enable, as the converter does not support it. If I try to set it on HDMI 2.1, the top portion of the screen becomes a pulsing, garbled mess. Literally all other resolutions work correctly, just not the one I usually use at my preferred refresh rate.

I think gaming performance has dropped in some titles, and I noticed that GPU usage sometimes drops unexpectedly to 70% or so. I need to investigate this further.

Keep an eye on plasmashell too
https://bugs.kde.org/show_bug.cgi?id=469094


I noticed my 3080 is idling at ~100w for some reason. I am not sure if this is a 555 bug, or due to some configuration I made.


Using the 555 beta with a 2080Ti using openSUSE, KDE Plasma 6 and Wayland.
555 has finally solved X11 apps (Steam, Electron-based apps) flickering.
KWin was quirky with the open source module (I’m using two monitors; one monitor was always 60 Hz, the other somewhere between 30 Hz and 60 Hz). Using the closed source module and NVreg_EnableGpuFirmware=0 seems to fix that.
Both Steam streaming (Steam Link) and Sunshine report NVENC broken, though ffmpeg and HandBrake seem to have no problem hardware encoding video. Steam says:

>>> Switching video stream from NONE to Desktop_MovieStream
[...]
Video Encoder Changed
>>> Capture method set to Desktop Placeholder + libyuv + NVENC HEVC
>>> Capture resolution set to 1920x1080
>>> Capture colorspace set to BT.601 limited range
NVENC - Submitting frame for encoding failed: 20
CLIENT: Texture rect: 2960x1440 at 0,0
CLIENT: Video rect: 2960x1440 at 0,0
CLIENT: Got control packet k_EStreamControlSetTouchConfigData
CLIENT: Ignoring duplicate controller configuration for 769, revision 22
CLIENT: Got control packet k_EStreamControlSetTouchConfigData
Caching cursor image for , serial 527, cache size = 0
Changing record window: 0x140007d
SynchronizeClientState(): setting cursor to 527
SynchronizeClientState(): setting activity to k_EStreamActivityDesktop: Steam Controller Configs - Desktop
>>> Client video decoder set to Android texture hardware decoding
CLIENT: Android texture hardware decoding
CLIENT: Got control packet k_EStreamControlStartVideoData
CLIENT: Got control packet k_EStreamControlVideoEncoderInfo
CLIENT: Got control packet k_EStreamControlSetCursorScale
CLIENT: Got control packet k_EStreamControlSetCursor
CLIENT: Got control packet k_EStreamControlSetActivity
CLIENT: Got control packet k_EStreamControlSetCursorImage
CGameStreamVideoStageVAAPI: Reinitializing 1920x1080 at 30000 Kbit/s and 59.75 FPS with quality 1
ffmpeg verbose: libva: VA-API version 1.21.0
ffmpeg verbose: libva: User environment variable requested driver 'nvidia'
ffmpeg verbose: libva: Trying to open /usr/lib/dri/nvidia_drv_video.so
ffmpeg verbose: libva: va_openDriver() returns -1
ffmpeg error: Failed to initialise VAAPI connection: -1 (unknown libva error).
CGameStreamVideoStageVAAPI: Failed to create device context: Input/output error
Detected 32 logical processors, using 4 threads
Configuring encoder: [threads=4][width=1920][height=1080][preset=superfast][tune=zerolatency]
Video Encoder Changed
>>> Capture method set to Desktop Placeholder + libyuv + libx264 main (4 threads)
CDesktopCaptureX11GL: OpenGL extension GLX_EXT_texture_from_pixmap not supported
>>> Capture method set to Desktop X11 RGB + libyuv + libx264 main (4 threads)
[...]

Error code 20 appears to be NV_ENC_ERR_GENERIC, so no help there.
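
For anyone wanting to decode these status numbers from a Steam streaming log, here is a minimal Python sketch. The table is a partial transcription of the NVENCSTATUS enum as I recall it from nvEncodeAPI.h, so treat every value except 20 (confirmed by the log above) as an assumption to check against your SDK header. The function name is made up for illustration.

```python
import re

# Partial NVENCSTATUS table. Assumption: transcribed from memory of
# nvEncodeAPI.h; verify against the header in your Video Codec SDK.
NVENC_STATUS = {
    0: "NV_ENC_SUCCESS",
    8: "NV_ENC_ERR_INVALID_PARAM",
    10: "NV_ENC_ERR_OUT_OF_MEMORY",
    20: "NV_ENC_ERR_GENERIC",
}

# Matches lines such as "NVENC - Submitting frame for encoding failed: 20".
_FAIL_RE = re.compile(r"NVENC - .*failed: (\d+)")


def decode_nvenc_failures(log_text):
    """Return (code, name) for each NVENC failure line found in log_text."""
    results = []
    for match in _FAIL_RE.finditer(log_text):
        code = int(match.group(1))
        results.append((code, NVENC_STATUS.get(code, "unknown")))
    return results


if __name__ == "__main__":
    line = "NVENC - Submitting frame for encoding failed: 20"
    print(decode_nvenc_failures(line))
```

Codes missing from the table come back as "unknown" rather than being guessed.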
The GPU idles around 15 watts, which I think is normal.

I have an ASUS laptop with an AMD Radeon 780M iGPU and an NVIDIA GeForce RTX 4060 Mobile Max-Q dGPU, running KDE 6 in a Wayland session.
The laptop panel is connected to the AMD GPU; the external monitor is connected to the NVIDIA GPU (HDMI port).
When I run the glxgears benchmark in the KDE 6 Wayland session on the external monitor, using either the 555.42.02 nvidia-open driver or the 555.42.02 proprietary driver without the nvidia.NVreg_EnableGpuFirmware=0 kernel option, the framerate is only half the screen refresh rate (in my case ~37-38 fps at a 75 Hz refresh rate).
This looks like bug 452219 – Low fps and high CPU usage on external monitor connected to NVIDIA when default GPU is Intel, but it is an NVIDIA driver regression, because with the nvidia-open 550.xx driver, or with the proprietary 555.42 driver and the nvidia.NVreg_EnableGpuFirmware=0 kernel option, I get a normal framerate on the external monitor.
I cannot use the NVIDIA proprietary driver 550.xx or 555.42 because it causes the kernel to panic (see Series 550 freezes laptop - #135 by mario156090), and NVIDIA has not been able to fix this problem for more than 3 months.
I do not want to use the nvidia-open 550.xx driver because with that driver and an external monitor I get very high CPU utilization from the kwin_wayland process.

In my case I had to add nvidia-current-uvm in /etc/modules and restart


Does this only happen after a suspend? I am seeing similar issues in Hellblade 2 but only after I suspend. Rebooting seems to return everything to normal.

PacketAuditor
I noticed my 3080 is idling at ~100w for some reason. I am not sure if this is a 555 bug, or due to some configuration I made.

This high idle power usage seems to be an NVIDIA problem as a whole, not just this driver. It happens for me on 545, 550 and 555. I don’t think anybody has a clue why.

I have had Xwayland leak VRAM too: nvidia-smi reported it using > 7 GiB of memory by the time it became unresponsive enough that I had to kill it. This was on sway, not KDE, though.

Edit: After restarting just Xwayland (and all X apps), with no updates or anything, a few hours of use later it's still at 12 MiB. I will try to keep an eye on it and see what triggers the leak.
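
To keep an eye on a reading like this over time, one could parse the process table that nvidia-smi prints. A rough Python sketch follows. The row layout (GPU, GI, CI, PID, Type, Process name, GPU Memory) is an assumption based on recent nvidia-smi output and may differ between driver versions; the sample rows and the function name are hypothetical and not taken from this thread.

```python
import re

# Assumed row layout of the nvidia-smi process table:
# | GPU  GI  CI  PID  Type  Process name  GPU Memory |
_ROW_RE = re.compile(
    r"^\|\s+(\d+)\s+\S+\s+\S+\s+(\d+)\s+(\S+)\s+(.+?)\s+(\d+)MiB\s+\|\s*$"
)


def vram_by_process(smi_output, name_filter="Xwayland"):
    """Return {pid: used_MiB} for rows whose process name contains name_filter."""
    usage = {}
    for line in smi_output.splitlines():
        match = _ROW_RE.match(line)
        if match and name_filter in match.group(4):
            usage[int(match.group(2))] = int(match.group(5))
    return usage


# Hypothetical sample rows for illustration only.
SAMPLE = """\
|    0   N/A  N/A      1743      G   /usr/bin/kwin_wayland                 310MiB |
|    0   N/A  N/A      1902      G   /usr/bin/Xwayland                    1228MiB |
"""

if __name__ == "__main__":
    print(vram_by_process(SAMPLE))
```

Feeding it the output of nvidia-smi from a timer and logging the result would show when the number starts climbing.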

I have installed this driver on Debian and can confirm that VAAPI does not work.
The Xorg driver is OK.
The Vulkan drivers are OK; games can be played.
Everything was working great before the driver updated.
I have tried completely removing all of the drivers, rebooting and reinstalling. Anything which tries to access nvidia_drv_video.so fails.

I am using:

echo $XDG_SESSION_TYPE
x11

Tested with:

$ vainfo
libva info: VA-API version 1.21.0
libva error: vaGetDriverNames() failed with unknown libva error
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva error: /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so init failed
libva info: va_openDriver() returns 1
vaInitialize failed with error code 1 (operation failed),exit
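
When comparing runs across driver versions or backends, it can help to reduce vainfo output to a few fields. Here is a small Python sketch that does this by matching the libva message strings seen in the output above; the function name and the returned keys are my own choices, not part of any tool.

```python
import re


def summarize_vainfo(output):
    """Pull the requested driver, library path and init status out of vainfo text."""
    driver = re.search(r"requested driver '([^']+)'", output)
    path = re.search(r"Trying to open (\S+)", output)
    status = re.search(r"va_openDriver\(\) returns (-?\d+)", output)
    return {
        "driver": driver.group(1) if driver else None,
        "path": path.group(1) if path else None,
        "open_status": int(status.group(1)) if status else None,
        "init_failed": "init failed" in output,
    }


# Lines copied from the failing run above.
FAILING = """\
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva error: /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so init failed
libva info: va_openDriver() returns 1
"""

if __name__ == "__main__":
    print(summarize_vainfo(FAILING))
```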

Settings:

LIBVA_DRIVER_NAME=nvidia
VDPAU_DRIVER=nvidia
NVD_BACKEND=direct

In modprobe.d/nvidia.conf

options nvidia_drm modeset=1
options nvidia NVreg_EnableGpuFirmware=0
options nvidia_drm fbdev=1
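
Since several posts in this thread hinge on which of these options are set, here is a minimal Python sketch that reads `options` lines like the ones above into a dictionary. It only parses the text of a config file; it does not tell you what the loaded module is actually using, which would have to be checked separately. The function name is hypothetical.

```python
def parse_modprobe_options(text):
    """Collect `options <module> key=value ...` lines into {module: {key: value}}."""
    result = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        parts = line.split()
        if len(parts) < 3 or parts[0] != "options":
            continue
        params = result.setdefault(parts[1], {})
        for item in parts[2:]:
            key, sep, value = item.partition("=")
            if sep:  # ignore tokens that are not key=value
                params[key] = value
    return result


# The three lines quoted above.
CONF = """\
options nvidia_drm modeset=1
options nvidia NVreg_EnableGpuFirmware=0
options nvidia_drm fbdev=1
"""

if __name__ == "__main__":
    print(parse_modprobe_options(CONF))
```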

Output of ltrace:

$ ltrace vainfo
strrchr("vainfo", '/')                                                                                                                              = nil
getopt_long(1, 0x7ffccb61ae18, "a", 0x55dab27f11e0, 0)                                                                                              = -1
calloc(1, 56)                                                                                                                                       = 0x55dab3a4a330
wl_display_connect(0, 0, 56, 64)                                                                                                                    = 0
free(0x55dab3a4a330)                                                                                                                                = <void>
XOpenDisplay(nil)                                                                                                                                   = 0x55dab3a4a370
vaGetDisplay(0x55dab3a4a370, 0, 0, 0xeff5)                                                                                                          = 0x55dab3a57f60
vaInitialize(0x55dab3a57f60, 0x7ffccb61acc0, 0x7ffccb61acc8, 480libva info: VA-API version 1.21.0
libva error: vaGetDriverNames() failed with unknown libva error
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so
<no return ...>
--- SIGCHLD (Child exited) ---
libva info: Found init function __vaDriverInit_1_0
libva error: /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so init failed
libva info: va_openDriver() returns 1
<... vaInitialize resumed> )                                                                                                                        = 1
vaErrorStr(1, 0x55dab3a4b940, 0x55dfee025b3b, 33)                                                                                                   = 0x7f4ba698c058
__fprintf_chk(0x7f4ba694c4e0, 1, 0x55dab27eea50, 0x55dab27ee328vaInitialize failed with error code 1 (operation failed),exit
)                                                                                    = 62
free(0)                                                                                                                                             = <void>
free(0)                                                                                                                                             = <void>
vaTerminate(0x55dab3a57f60, 0x7ffccb61aaf0, 0, 0)                                                                                                   = 0
XCloseDisplay(0x55dab3a4a370)                                                                                                                       = <void>
+++ exited (status 3) +++

I have tried installing other VA drivers, removing them, removing and reinstalling the VDPAU drivers, and rebooting many times, but nothing seems to change. This has worked fine across many driver updates in the past, but it does not work with this one.

I investigated some of the performance issues further:

For me, waking from sleep does not appear to be a factor. Things behave the same after a cold boot and after sleep.

options nvidia NVreg_EnableGpuFirmware=0 definitely improves desktop performance and general gaming performance. In games where I am able to reach around 99-100% GPU utilization, the frametimes are better. It also completely got rid of the constant crashes in Alan Wake 2. I think there might be some form of memory leak here: checking memory usage, with options nvidia NVreg_EnableGpuFirmware=1 the VRAM usage very quickly reaches 10 GB, whereas with =0 it stays below that the whole time. I still see frame spikes when moving, but again I need to check further.

However, I have still seen a number of games lose significant performance, with the GPU only utilized to 80% or so. For example, Horizon Zero Dawn loses 30-40 fps compared to its maximum. If I switch from Wayland to X, usage goes to 100% and I get the additional performance. In the past I had seen Wayland performing on par with X, but I only really use Wayland now, so I have not been keeping track for a few driver versions.

A log from when I try to set 3840x1600 at 120 Hz on HDMI 2.1:

nvidia-bug-report.log.gz (686.5 KB)

This also applies to 555.52.04.

(Cross-posting from github regarding the stutter issues, in case any folks here are willing to try compiling the open kernel modules with patches on top)

Update: We’ve found two possible causes of stutter. Or rather, we found two issues that definitely cause stutter on some configurations, but we still don’t have a good idea of how widespread either of them is.

I have published patches that eliminate one and log the other here: Patches for testing r555 stutter issues by mtijanic · Pull Request #658 · NVIDIA/open-gpu-kernel-modules · GitHub

I’d love it if folks that are experiencing these issues would give it a try and report back. Getting a good idea of the impact would help us prioritize getting these in. Many thanks in advance!

vainfo
libva info: VA-API version 1.21.0
libva error: vaGetDriverNames() failed with unknown libva error
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.21 (libva 2.12.0)
vainfo: Driver version: VA-API NVDEC driver [egl backend]
vainfo: Supported profile and entrypoints

export LIBVA_DRIVER_NAME=nvidia
export MOZ_DISABLE_RDD_SANDBOX=1
export NVD_BACKEND=egl

use EGL…

No mention of all the other issues? What about stuttering issues with GSP off?

NetBeans has an easily reproducible issue where it becomes unusable if the GPU is at 100% usage.

VRAM management is still at its worst on this driver version. On Wayland, trying to open a game or program that is likely to fill the VRAM will kill every process using the GPU (the running game, the compositor, everything), with the error Failed to allocate NVKMS memory for GEM object in dmesg:

...
[ 2660.476282] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
[ 2660.476656] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
[ 2660.516913] kwin_wayland[1743]: segfault at 10 ip 00007a47a78d87e4 sp 00007fff2a96bee8 error 4 in libkwin.so.6.0.5[7a47a7547000+52b000] likely on CPU 0 (core 0, socket 0)
[ 2660.516925] Code: 00 00 00 0f 1f 00 f3 0f 1e fa 48 8b 47 10 48 8b 38 ff 25 6f 0d 34 00 90 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa <48> 8b 47 10 48 8b 00 c3 0f 1f 40 00 f3 0f 1e fa 48 8b 47 10 48 8b
[ 2660.942238] flatpak[1184927]: segfault at f0 ip 0000620431525b11 sp 00007ffebea55990 error 4 in flatpak[6204313d4000+153000] likely on CPU 0 (core 0, socket 0)
[ 2660.942248] Code: e9 02 f4 ed ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 41 54 48 63 c2 48 63 f6 55 53 48 8b 17 <48> 8b 14 f2 48 85 d2 74 7f 48 8b 12 48 8b 12 48 8b 1c c2 48 85 db
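
To check whether crashes on your own system follow the same pattern, a short Python sketch like the one below can scan dmesg text for the allocation error and list segfaults logged shortly after it. The 5-second window and the function name are arbitrary choices of mine. The sample lines are shortened from the log above.

```python
import re

_TS = r"\[\s*(\d+\.\d+)\]"
_ALLOC_RE = re.compile(_TS + r".*Failed to allocate NVKMS memory for GEM object")
_SEGV_RE = re.compile(_TS + r" (\S+)\[(\d+)\]: segfault")


def crashes_after_alloc_failure(dmesg_text, window_s=5.0):
    """List (process, pid, delay_s) for segfaults within window_s of an allocation failure."""
    alloc_times = [float(m.group(1)) for m in _ALLOC_RE.finditer(dmesg_text)]
    hits = []
    for m in _SEGV_RE.finditer(dmesg_text):
        t = float(m.group(1))
        # Allocation failures that happened at most window_s before this segfault.
        prior = [a for a in alloc_times if 0 <= t - a <= window_s]
        if prior:
            hits.append((m.group(2), int(m.group(3)), round(t - max(prior), 3)))
    return hits


# Shortened copies of the dmesg lines quoted above.
SAMPLE = """\
[ 2660.476282] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
[ 2660.476656] [drm:nv_drm_gem_alloc_nvkms_memory_ioctl [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NVKMS memory for GEM object
[ 2660.516913] kwin_wayland[1743]: segfault at 10 ip 00007a47a78d87e4 sp 00007fff2a96bee8 error 4 in libkwin.so.6.0.5[7a47a7547000+52b000]
[ 2660.942238] flatpak[1184927]: segfault at f0 ip 0000620431525b11 sp 00007ffebea55990 error 4 in flatpak[6204313d4000+153000]
"""

if __name__ == "__main__":
    for name, pid, delay in crashes_after_alloc_failure(SAMPLE):
        print(name, pid, delay)
```

A segfault landing close after an allocation failure is only a correlation, so the output is a starting point for a bug report rather than proof of cause.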

I already made a thread about this issue almost a year ago: Non-existent shared VRAM on NVIDIA Linux drivers

I will test this on my Laptop with a 3070 (8GB of VRAM).

I beg you, please fix this VRAM issue. This is the only driver without RAM shared with the system.
