That phrase I’ve described was a quote from the driver release highlights
I’ve never used a display manager. I always log in through a TTY, but I’m still having the problem, sadly. It seems a lot of people on the Hyprland Discord server are having this issue, all with NVIDIA GPUs too.
The most recent driver release has caused another regression in Gamescope where some games, like Path of Exile 2, fail to launch inside of Gamescope with NVRM: dmaAllocMapping_GM107: Virtual address 0xACE0000 is not compatible with page size 0x200000 or page offset 0x0.
This does not happen on the previous 570 release.
Nvidia bug reports are attached to the above issue
dmesg got spammed by this again; it seems to be because of RPCS3 (PS3 emulator).
WARNING: CPU: 11 PID: 1049 at drivers/gpu/drm/drm_prime.c:326 drm_gem_prime_fd_to_handle.cold+0x6c/0x73
Nvidia bug report is in the previous post.
Running 6.13.5-arch1-1 with 570.124.04-2. When I turn off my monitor and turn it back on, the screen stays black and my second monitor, which I didn’t turn off, is frozen. I can’t move the cursor or do anything else. Only a hard reset gets things back to normal. This didn’t happen with older drivers.
I’m getting the error: nvidia-modeset: ERROR: GPU:0: Error while waiting for GPU progress
nvidia-bug-report.log.gz (1.3 MB)
Updated to 570.124.04, and the kernel modules load fine, but nvidia-smi doesn’t find any devices and nothing can use my card. inxi shows that it’s loaded and so does lsmod. Using openSUSE Tumbleweed with an Intel iGPU and a 2070 Max-Q. Already tried disabling fbdev and I wasn’t using the GSP
edit: was a permissions issue, not related to the driver
I have been facing an issue (actually since 560) where I can run any application only once.
After that I can’t open anything on the GPU anymore.
I am using an Intel+NVIDIA laptop.
When I enable MUX mode, where only the NVIDIA dGPU is active, the issue remains the same.
I test this using the Superposition benchmark.
At some point the application begins throwing errors.
Mär 01 19:16:25 crashtux kernel: NVRM: GPU at PCI:0000:01:00: GPU-f6cd1b06-0e50-622e-08d9-5b44281bcb65
Mär 01 19:16:25 crashtux kernel: NVRM: Xid (PCI:0000:01:00): 62, 221e65ad 50e010e7 00000000 202a529e 202a4938 202a61f6 202a6008 202a5f7e
Mär 01 19:16:25 crashtux kernel: NVRM: Xid (PCI:0000:01:00): 154, GPU recovery action changed from 0x0 (None) to 0x1 (GPU Reset Required)
The benchmark, however, keeps running.
Opening any other application now crashes immediately.
Mär 01 19:38:38 crashtux kwin_wayland[1321]: kwin_scene_opengl: 0x4: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Mär 01 19:38:38 crashtux kwin_wayland[1321]: kwin_scene_opengl: 0x4: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Mär 01 19:38:38 crashtux kwin_wayland[1321]: kwin_scene_opengl: 0x4: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Mär 01 19:38:38 crashtux kwin_wayland[1321]: kwin_scene_opengl: 0x4: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Mär 01 19:38:38 crashtux kwin_wayland[1321]: kwin_scene_opengl: 0x4: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Mär 01 19:38:38 crashtux kwin_wayland[1321]: kwin_scene_opengl: 0x4: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
Mär 01 19:38:50 crashtux kernel: NVRM: rpcRmApiAlloc_GSP: GspRmAlloc failed: hClient=0xc1d00039; hParent=0xcef80000; hObject=0xbeef0100; hClass=0x0000c56f; paramsSize=0x00000170; paramsStatus=0x00000062; status=0x00000062
Mär 01 19:38:50 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from status @ kernel_channel.c:2874
Mär 01 19:38:50 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from _kchannelSendChannelAllocRpc(pKernelChannel, pChannelGpfifoParams, pKernelChannelGroup, bFullSriov) @ kernel_channel.c:941
Mär 01 19:38:50 crashtux kernel: NVRM: rpcRmApiAlloc_GSP: GspRmAlloc failed: hClient=0xc1d00039; hParent=0xcef80000; hObject=0xbeef0100; hClass=0x0000c56f; paramsSize=0x00000170; paramsStatus=0x00000062; status=0x00000062
Mär 01 19:38:50 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from status @ kernel_channel.c:2874
Mär 01 19:38:50 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from _kchannelSendChannelAllocRpc(pKernelChannel, pChannelGpfifoParams, pKernelChannelGroup, bFullSriov) @ kernel_channel.c:941
Mär 01 19:38:50 crashtux kernel: NVRM: rpcRmApiAlloc_GSP: GspRmAlloc failed: hClient=0xc1d0003d; hParent=0xcef80000; hObject=0xbeef0100; hClass=0x0000c56f; paramsSize=0x00000170; paramsStatus=0x00000062; status=0x00000062
Mär 01 19:38:50 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from status @ kernel_channel.c:2874
Mär 01 19:38:50 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from _kchannelSendChannelAllocRpc(pKernelChannel, pChannelGpfifoParams, pKernelChannelGroup, bFullSriov) @ kernel_channel.c:941
Mär 01 19:38:50 crashtux kernel: NVRM: rpcRmApiAlloc_GSP: GspRmAlloc failed: hClient=0xc1d0003d; hParent=0xcef80000; hObject=0xbeef0100; hClass=0x0000c56f; paramsSize=0x00000170; paramsStatus=0x00000062; status=0x00000062
Mär 01 19:38:50 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from status @ kernel_channel.c:2874
Mär 01 19:38:50 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from _kchannelSendChannelAllocRpc(pKernelChannel, pChannelGpfifoParams, pKernelChannelGroup, bFullSriov) @ kernel_channel.c:941
Mär 01 19:38:50 crashtux kwin_wayland[1321]: kf.windowsystem: static bool KX11Extras::mapViewport() may only be used on X11
Mär 01 19:38:51 crashtux kernel: NVRM: rpcRmApiAlloc_GSP: GspRmAlloc failed: hClient=0xc1d0003d; hParent=0xcef80000; hObject=0xbeef0100; hClass=0x0000c56f; paramsSize=0x00000170; paramsStatus=0x00000062; status=0x00000062
Mär 01 19:38:51 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from status @ kernel_channel.c:2874
Mär 01 19:38:51 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from _kchannelSendChannelAllocRpc(pKernelChannel, pChannelGpfifoParams, pKernelChannelGroup, bFullSriov) @ kernel_channel.c:941
Mär 01 19:38:51 crashtux kernel: NVRM: rpcRmApiAlloc_GSP: GspRmAlloc failed: hClient=0xc1d0003d; hParent=0xcef80000; hObject=0xbeef0100; hClass=0x0000c56f; paramsSize=0x00000170; paramsStatus=0x00000062; status=0x00000062
Mär 01 19:38:51 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from status @ kernel_channel.c:2874
Mär 01 19:38:51 crashtux kernel: NVRM: nvAssertOkFailedNoLog: Assertion failed: Reset required [NV_ERR_RESET_REQUIRED] (0x00000062) returned from _kchannelSendChannelAllocRpc(pKernelChannel, pChannelGpfifoParams, pKernelChannelGroup, bFullSriov) @ kernel_channel.c:941
Mär 01 19:38:51 crashtux kernel: QSGRenderThread[6552]: segfault at 1 ip 00007414ec6baf55 sp 00007414ceffce40 error 4 in libnvidia-glcore.so.570.124.04[6baf55,7414ec400000+c00000] likely on CPU 8 (core 16, socket 0)
Mär 01 19:38:51 crashtux kernel: Code: 5d e9 5f d5 18 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 05 a1 50 b3 01 55 64 48 8b 28 83 ff 0f 0f 87 32 bb bb ff <0f> b6 46 01 48 8d 15 40 6c b9 01 44 8b 4c 82 34 0f b6 46 02 44 8b
Mär 01 19:38:51 crashtux systemd-coredump[6559]: Process 6527 (launcher) of user 1000 terminated abnormally with signal 11/SEGV, processing...
Mär 01 19:38:51 crashtux systemd[1]: Created slice Slice /system/systemd-coredump.
Mär 01 19:38:51 crashtux systemd[1]: Started Process Core Dump (PID 6559/UID 0).
Mär 01 19:38:51 crashtux systemd-coredump[6560]: [🡕] Process 6527 (launcher) of user 1000 dumped core.
Stack trace of thread 6552:
#0 0x00007414ec6baf55 n/a (n/a + 0x0)
ELF object binary architecture: AMD x86-64
It doesn’t matter whether I am on X11 or Wayland.
Enabling or disabling VRR doesn’t make a difference either.
The only thing that appears to make a difference is setting a 60Hz refresh rate rather than 120Hz, where it crashes immediately.
OS: CachyOS
Kernel: Linux 6.13.5-2-cachyos
CPU: 13th Gen Intel(R) Core(TM) i9-13900HX (32) @ 5.40 GHz
GPU 1: NVIDIA GeForce RTX 4080 Max-Q / Mobile [Discrete]
DE: KDE Plasma 6.3.2
Display (PG48UQ): 3840x2160 @ 120 Hz in 31" [External]
NvidiaDriver: 570.124.04
What does Xid 154 mean?
The Xid 62 happens on 555.58 as well, but the GPU doesn’t crash there.
nvidia-bug-report.log.gz (616.6 KB)
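For anyone who wants to pull the Xid events out of a long journal like the one above, here is a minimal Python sketch. It assumes the kernel lines have the same shape as the two Xid lines quoted in this post; the function name parse_xid_lines and the regular expression are illustrative, not part of any NVIDIA tool. The detail text it extracts for Xid 154 is simply what the driver itself logged ("GPU recovery action changed ... GPU Reset Required").

```python
import re

# Assumed line shape, taken from the log lines quoted in this post:
#   NVRM: Xid (PCI:0000:01:00): 62, 221e65ad ...
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+), (.*)")


def parse_xid_lines(log_text):
    """Return a list of (pci_address, xid_number, detail) tuples found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            events.append((match.group(1), int(match.group(2)), match.group(3).strip()))
    return events


# Sample input: the two Xid lines from the journal excerpt above.
sample = """\
Mär 01 19:16:25 crashtux kernel: NVRM: Xid (PCI:0000:01:00): 62, 221e65ad 50e010e7 00000000 202a529e 202a4938 202a61f6 202a6008 202a5f7e
Mär 01 19:16:25 crashtux kernel: NVRM: Xid (PCI:0000:01:00): 154, GPU recovery action changed from 0x0 (None) to 0x1 (GPU Reset Required)
"""

for pci, xid, detail in parse_xid_lines(sample):
    print(pci, xid, detail)
```

On a real system the same function could be fed the text of the kernel journal instead of the sample string.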
Okay, I discovered something weird.
I can now reproduce the freeze (albeit not as quickly) on 550 and 555 as well.
The freeze happens, but the Xid 154 does not appear there.
So I guess it has to do with something surrounding it.
But I don’t get any other messages; the journal stays empty.
Can anyone help me, please?
My fallback is gone now and I can’t use my system anymore.
UPDATE:
I tested various configurations by now.
There are at least two things that make these issues appear.
Kernel 6.12+
The last configuration that appears to work in my case is kernel 6.11.9 with nvidia version 555.58.02.
The moment I update to the kernel versions 6.12 or 6.13 the crashes start to occur even on 550.
Driver 560+
Starting with the 560 driver the freezes/crashes always occur.
The kernel version doesn’t matter in this case (tested from 6.8 to 6.14-rc).
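The two conditions in this update can be written down as a small predicate, which may help when comparing configurations. This is only a sketch of what was observed on this one machine, not a general rule; the helper names parse_version and expected_to_crash are made up for illustration.

```python
import re


def parse_version(text):
    """Extract the leading dotted number from strings like '6.13.5-2-cachyos' or '555.58.02'."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def expected_to_crash(kernel, driver):
    """Encode the two reported triggers: kernel 6.12 or newer, or driver 560 or newer."""
    kernel_affected = parse_version(kernel)[:2] >= (6, 12)
    driver_affected = parse_version(driver)[0] >= 560
    return kernel_affected or driver_affected


# The last configuration reported as working, and the currently installed one.
print(expected_to_crash("6.11.9", "555.58.02"))
print(expected_to_crash("6.13.5-2-cachyos", "570.124.04"))
```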
So what’s the point of having a great feature and then removing it? It’s such an NVIDIA thing to remove voltage monitoring from nvidia-smi. Is there any other way to read it? I really love my mV in MangoHud.
570.124.04 is unusable and hangs with 6.13.5 kernel.
On my system (AMD Threadripper Pro/RTX4090) the 570.124.04 driver hangs when X tries to start up and I get the following errors:
Mar 1 03:22:23 flash kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67e:6:0:1230
Mar 1 03:46:18 flash kernel: nvidia-modeset: ERROR: GPU:0: Error while waiting for GPU progress: 0x0000c77d:0 2:0:1500:1488
The latter just repeats.
I have four monitors attached.
If anyone else is having this problem, I was able to get it working on Linux 6.13 by loading the drm_ttm_helper and drm_client_lib kernel modules before running the .run installer.
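A quick way to check whether those two modules are already loaded is to look at /proc/modules. The sketch below is only an illustration: missing_modules is a hypothetical helper, and the sample text is made up rather than taken from a real machine. Note that modules built directly into the kernel will not show up in /proc/modules, so a name reported here is not necessarily a problem.

```python
def missing_modules(proc_modules_text, required=("drm_ttm_helper", "drm_client_lib")):
    """Return the required module names that do not appear in /proc/modules-style text."""
    # The first whitespace-separated field of each line is the module name.
    loaded = {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}
    return [name for name in required if name not in loaded]


# Hypothetical /proc/modules excerpt, for illustration only.
sample = """\
drm_ttm_helper 16384 1 nvidia_drm, Live 0x0000000000000000
video 81920 1 nvidia_modeset, Live 0x0000000000000000
"""

print(missing_modules(sample))
```

On a real system you would pass the contents of /proc/modules instead of the sample string, and load anything it reports with modprobe (as root) before starting the installer.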
I think they would need to expose voltage monitoring through NVML, which I have a feeling they are reluctant to do for some reason.
Unfortunately this issue persists with the 570.124.04 Production Branch driver:
In that original thread I’ve provided:
- nvidia-bug-report
- strace/perf report of the offending cpu-consuming threads
- flamegraphs of the perf reports
- Full GDB backtrace of crash
This is 100% reproducible by running any GTK4 app with GTK 4.16.x using the default Vulkan backend renderer.
Current stack:
- Arch Linux | Kernel 6.13.5
- Gnome-shell | Mutter 47.4
- Wayland (meson_options: xwayland=false, x11=false)
- Gtk4 4.16.12
- Mesa 24.3.4
- vulkan-icd-loader 1.4.304
- NVIDIA 570.124.04
Was hoping I could actually use the new driver, and then I just get this again
These days we are just testers for NVIDIA; they sell their cards with buggy drivers. This has been going on for over 2 years now (Windows and Linux). Have all the “good” engineers from years ago been fired?
This is 100% reproducible by running any GTK4 app with GTK 4.16.x using the default Vulkan backend renderer.
Same for me. I only just switched to NVIDIA, so it took me a while to figure out this was even an issue with the GPU driver and not some GNOME update. Nautilus was going crazy; just moving it from one monitor to another instantly bricked it until GNOME was restarted.
Setting GSK_RENDERER=cairo in environment “fixed” it for me.
Gentoo Kernel 6.13.5
Mutter 47.5 w/triple buffering
Wayland 1.23.1
GTK 4.16.12
vulkan-loader 1.4.304.0
Nvidia driver 570.86.16 / 570.124.04
It looks like the 570 drivers have some problems with Vulkan in Wayland apps. At least some games hang when using DXVK with Wine in Wayland mode (DISPLAY=). Apparently this only happens with the NVIDIA driver.
Here is a small list of games where this can be reproduced:
WILL A Wonderful World
Terraria
Mondealy
Signalis
Dread Delusion
Deus Ex Revision
This does not happen when running through Xwayland.
Can anyone check it?
Which version? By the way, note that there’s a separate package for the NVIDIA implementation. On Ubuntu, for example, one has:
libwayland-egl1 (and the i386 version)
and
libnvidia-egl-wayland1 (plus the i386 version of the lib)
For a more recent version of the package you would have to add the graphics drivers PPA, or install it manually or from elsewhere if you’re using a different distro. For Ubuntu and Ubuntu-based distros it’s probably better to use the PPA and apt install nvidia-driver-570-open than to use the official NVIDIA installer.
Are you experiencing the increased CPU usage on ‘idle’, crashing on exit, or both?
The issue you described (nautilus freeze?) sounds different from what I detail in my original post topic.
Regardless, the Cairo backend is intended for development testing, not for daily use, and will have its own issues.
You’ll be far better served by using the NGL backend (GSK_RENDERER=ngl). It was the default prior to Vulkan, sees lots of testing, and is actively maintained.
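If you would rather switch the renderer for a single app than set it session-wide, something like the following Python sketch should do. The variable name GSK_RENDERER and the value ngl come from the posts above; env_with_renderer is a hypothetical helper, and the PATH value in the demo is a placeholder.

```python
import os


def env_with_renderer(renderer, base_env=None):
    """Return a copy of the environment with GSK_RENDERER set to the given backend."""
    # Copy so the caller's environment (or os.environ) is not modified.
    env = dict(os.environ if base_env is None else base_env)
    env["GSK_RENDERER"] = renderer
    return env


# Demo with a small placeholder environment instead of the real one.
env = env_with_renderer("ngl", base_env={"PATH": "/usr/bin"})
print(env["GSK_RENDERER"])
```

To launch a GTK4 app this way you could pass the result to subprocess.run, for example subprocess.run(["nautilus"], env=env_with_renderer("ngl")).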
