Still freezes for me with 565.77 X11 KDE. @amrits what gives?
Do you folks have Prime installed? I was having issues where stuff on the dGPU was freezing under load if I had the iGPU enabled and didn't have Prime installed. Once I installed Prime, the freezes stopped.
What is "Prime"? I can't find such a package for Debian/Ubuntu.
It should be included in the drivers now. You can run it with the nvidia-settings command; it starts a GUI where you can switch PRIME profiles.
Ok, but there are no default profiles:
Where could I get proper profiles and rules to avoid freezes?
Now I see, you don't even have a PRIME Profiles menu option. To me it doesn't look like it's supported on your computer.
I think in Debian/Ubuntu the package is called nvidia-prime.
sudo apt-get install nvidia-prime
There is no such package for Debian, but I've tried installing Ubuntu's one, and nothing changed.
According to the xrandr output I've got a dGPU:
$ xrandr --listproviders
Providers: number : 2
Provider 0: id: 0x45 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 4 outputs: 3 associated providers: 1 name:modesetting
Provider 1: id: 0x270 cap: 0x2, Sink Output crtcs: 4 outputs: 1 associated providers: 1 name:NVIDIA-G0
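The cap value on each provider line is a bitmask, and the names printed after it are its decoded bits. A minimal Python sketch of that decoding, assuming the RandR 1.4 provider capability bit values (1, 2, 4, 8); the function name decode_provider_caps is made up for illustration:

```python
# Provider capability bits, assumed to follow the RandR 1.4 protocol.
PROVIDER_CAPS = {
    0x1: "Source Output",
    0x2: "Sink Output",
    0x4: "Source Offload",
    0x8: "Sink Offload",
}


def decode_provider_caps(mask: int) -> list:
    """Return the capability names whose bits are set in mask."""
    return [name for bit, name in PROVIDER_CAPS.items() if mask & bit]


if __name__ == "__main__":
    # cap values copied from the xrandr --listproviders output above
    for provider, cap in (("modesetting", 0xF), ("NVIDIA-G0", 0x2)):
        print(provider, decode_provider_caps(cap))
```

For the two providers above this yields all four capabilities for modesetting and only Sink Output for NVIDIA-G0, which matches the text xrandr printed.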
And according to glxinfo I've got two different render devices:
$ __GL_SYNC_TO_VBLANK=0 __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only glxinfo |head -n5
name of display: :0
display: :0 screen: 0
direct rendering: Yes
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
$ glxinfo |head -n5
name of display: :0
display: :0 screen: 0
direct rendering: Yes
server glx vendor string: SGI
server glx version string: 1.4
So I definitely have two different GPUs in my laptop and PRIME is supported.
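The same comparison can be scripted. A rough Python sketch, assuming glxinfo is on PATH and an X session is running; the helper names are made up, and only the two offload variables from the command above are set:

```python
import os
import re
import subprocess
from typing import Optional

# Environment variables taken from the glxinfo command above.
OFFLOAD_ENV = {
    "__NV_PRIME_RENDER_OFFLOAD": "1",
    "__GLX_VENDOR_LIBRARY_NAME": "nvidia",
}


def server_glx_vendor(glxinfo_text: str) -> Optional[str]:
    """Extract the 'server glx vendor string' value from glxinfo output."""
    match = re.search(r"^server glx vendor string:\s*(.+)$", glxinfo_text, re.M)
    return match.group(1).strip() if match else None


def run_glxinfo(offload: bool) -> str:
    """Run glxinfo, optionally with the PRIME render offload variables set."""
    env = dict(os.environ)
    if offload:
        env.update(OFFLOAD_ENV)
    result = subprocess.run(
        ["glxinfo"], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    print("default:", server_glx_vendor(run_glxinfo(offload=False)))
    print("offload:", server_glx_vendor(run_glxinfo(offload=True)))
```

If the two printed vendors differ, as they do in the output above (SGI versus NVIDIA Corporation), the offload variables are being honoured.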
I have Ubuntu as well, and it came when I selected the proprietary drivers during the Ubuntu installation. Maybe that's the difference, and that does the whole setup for you. However, you should be able to select that somewhere in the update center and enable the proprietary drivers, and maybe it will update? Dunno. But how do you know PRIME is supported, do you have a mux switch? For example, I do. I'm pretty unsure that PRIME itself fixes it anyway, but you can try… Btw I use Wayland too.
PRIME doesn't use a mux switch. It uses shared memory for rendering. And I can check that the dGPU is used for rendering, e.g. through nvidia-smi:
$ nvidia-smi
Sat Dec 21 12:17:27 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.142 Driver Version: 550.142 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 ... Off | 00000000:01:00.0 On | N/A |
| N/A 47C P5 5W / 60W | 302MiB / 4096MiB | 2% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1399 G /usr/lib/xorg/Xorg 115MiB |
| 0 N/A N/A 6880 G /usr/lib/vmware/bin/mksSandbox 177MiB |
+-----------------------------------------------------------------------------------------+
If I run VMware in PRIME offload mode, I get a line with /usr/lib/vmware/bin/mksSandbox, which tells us that VMware uses the NVIDIA dGPU. What could it be if not PRIME?
Ya, you don't need a mux for PRIME. I was having GPU crashes on my desktop with the iGPU on the CPU driving 2 monitors and an NV card driving the other 4. Putting the dGPU under stress caused the driver to crash. Installing Prime to mediate the shared resources between the two GPUs solved the crashes for me.
This has come up many times; it's either not a solution in all cases (i.e. freezes have still been observed) or it's not a solution because it locks in high power draw on a laptop.
I think nvidia-prime is just a tool for managing profiles; it's not necessary for PRIME Render Offload on X (Chapter 35. PRIME Render Offload). Wayland is an entirely different story (I don't think it even uses PRIME?).
I'm also affected by this bug. I'm not using the dedicated GPU; I'm using the On-Demand mode. Using the performance mode causes performance issues with GNOME Shell (under both Wayland and Xorg). Regarding this issue in particular:
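To check this without reading the table by eye, the process rows can be picked out of the nvidia-smi text. A small Python sketch, assuming the row layout shown above (GPU, GI, CI, PID, type, process name, memory); the regex and function name are illustrative only, and process names containing spaces would not match:

```python
import re

# One process row: | GPU GI CI PID Type Process-name <N>MiB |
ROW = re.compile(
    r"^\|\s+(?P<gpu>\d+)\s+\S+\s+\S+\s+(?P<pid>\d+)\s+(?P<type>[CG+]+)\s+"
    r"(?P<name>\S+)\s+(?P<mib>\d+)MiB\s+\|$"
)


def gpu_processes(smi_text: str) -> list:
    """Return one dict per process row found in nvidia-smi's table output."""
    rows = []
    for line in smi_text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append({
                "gpu": int(m["gpu"]),
                "pid": int(m["pid"]),
                "type": m["type"],
                "name": m["name"],
                "mib": int(m["mib"]),
            })
    return rows


def uses_gpu(smi_text: str, process_name: str) -> bool:
    """True if a process whose path ends with process_name is listed."""
    return any(p["name"].endswith(process_name) for p in gpu_processes(smi_text))
```

Feeding it the output above, uses_gpu(text, "mksSandbox") would report whether the VMware sandbox is listed on the NVIDIA GPU.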
- Host: Lenovo Legion Pro 5 16ARX8
- Kernel: Linux 6.11.0-13-generic
- Resolution: 2560x1600, 240 Hz
- Resolution external monitor (HDMI): 1920x1080, 74.97 Hz
- OS: Ubuntu 24.10
- DE: GNOME 47
- WM: X11
- CPU: AMD Ryzen 9 7945HX with Radeon Graphics × 32
- GPU: AMD Radeon 610M
- GPU: NVIDIA GeForce RTX 4070 Laptop GPU
- Memory: 32,0 GiB
- Driver version: 560.35.03
The issue almost instantly occurs when resizing the glxgears window.
Lowering the refresh rate of the external monitor to 60 Hz works for me. So far, it hasn't failed, even when trying to reproduce the error using vkcube and overlapping glxgears.
Update: No, the bug still occurs even at 60 Hz. It's harder to reproduce, but I had YouTube on one screen while using Steam with my discrete card on the main screen, and it froze. Also, using Firefox + vkcube + glxgears causes the secondary screen to freeze as well.
I have been facing this issue for over a year now. Any fix, guys? Help a poor soul.
I've investigated the 2-second freezes while resizing glxgears using the 565.77 open kernel module drivers, and these freezes are directly related to an internal timeout - open-gpu-kernel-modules/src/nvidia-modeset/src/nvkms.c at 9d0b0414a5304c3679c5db9d44d2afba8e58cc1b · NVIDIA/open-gpu-kernel-modules · GitHub in the IdleBaseChannelAll function:
/*!
 * Idle all requested heads.
 *
 * First, wait for the heads to idle naturally. If a timeout is exceeded, then
 * force the non-idle heads to idle, and record these in pReply.
 */
static NvBool IdleBaseChannelAll(
    NVDevEvoPtr pDevEvo,
    const struct NvKmsIdleBaseChannelRequest *pRequest,
    struct NvKmsIdleBaseChannelReply *pReply)
{
    NvU64 startTime = 0;
    /*
     * Each element in subDevicesPerHead[] must be large enough to hold one bit
     * per subdevice.
     */
    ct_assert(NVKMS_MAX_SUBDEVICES <=
              (sizeof(pRequest->subDevicesPerHead[0]) * 8));
    /* Loop until all head,sd pairs are idle, or we time out. */
    do {
        const NvU32 timeout = 2000000; /* 2 seconds */
        /*
         * Clear the pReply data,
         * IdleBaseChannelCheckIdle() will fill it afresh.
         */
        nvkms_memset(pReply, 0, sizeof(*pReply));
Matching Linux kernel function_graph trace:
# tracer: function_graph
#
# function_graph latency trace v1.1.5 on 6.9.12
# --------------------------------------------------------------------
# latency: 0 us, #3/3, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:20)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| /
# CPU TASK/PID |||| DURATION FUNCTION CALLS
# | | | |||| | | | | | |
6) Xorg-1464 | ..... | $ 2000502 us | } /* nvkms_unlocked_ioctl [nvidia_modeset] */
8) InputTh-1866 | ..... | $ 1998402 us | } /* nvkms_unlocked_ioctl [nvidia_modeset] */
7) nvidia--870 | ..... | $ 1997807 us | } /* nvkms_kthread_q_callback [nvidia_modeset] */
As you can see, Xorg calls the nvidia_modeset ioctl through NVIDIA's Xorg driver and it freezes for almost 2 seconds.
I've changed this value to 10 ms (10000), rebuilt the module, and glxgears began resizing flawlessly … until my external monitor froze completely. So I think there are two different issues in the NVIDIA driver: the first is caused by waiting for the rendering queue to become idle, and the second is undetectable at the moment. The Linux kernel function_graph tracer can't help to detect this issue, or maybe I don't know how to set it up properly. :-(
PS: I can't understand why NVIDIA's developers use a 2-second timeout in such a case. To me it is almost an eternity compared to the monitor's frame time.
For what it's worth, I've discovered a workaround. If I boot into the BIOS and change the option from "Hybrid Graphics" to "Dedicated Graphics", the problem no longer occurs. Obviously, this is far from ideal, as the dGPU never powers down in this setting and chews up battery, but it's better than nothing. This is on a Lenovo LOQ 15 with an RTX 4050 and a Ryzen 7 8845HS.
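For longer traces, the slow calls can be filtered out automatically. A minimal Python sketch that parses function_graph exit lines in the layout shown above (CPU, task-PID, flags, duration, closing brace with the function name in a comment); the regex, the function name and the 1-second default threshold are my own choices, not part of any tool:

```python
import re

# Matches lines such as:
#   6) Xorg-1464 | ..... | $ 2000502 us | } /* nvkms_unlocked_ioctl [nvidia_modeset] */
# The optional marker before the duration ('$', '@', '*', '#', '!', '+') is
# ftrace's overhead indicator; '$' means the call took longer than 1 second.
EXIT_LINE = re.compile(
    r"^\s*(?P<cpu>\d+)\)\s+(?P<task>\S+)-(?P<pid>\d+)\s+\|\s+\S+\s+\|\s+"
    r"[!#$@*+]?\s*(?P<us>[\d.]+)\s+us\s+\|\s+\}\s+/\*\s+(?P<func>\w+)"
)


def slow_calls(trace_text: str, min_us: float = 1_000_000.0) -> list:
    """Return (task, pid, duration_us, function) for calls of at least min_us."""
    found = []
    for line in trace_text.splitlines():
        m = EXIT_LINE.match(line)
        if m and float(m["us"]) >= min_us:
            found.append((m["task"], int(m["pid"]), float(m["us"]), m["func"]))
    return found
```

Header lines starting with # are skipped because they do not match the pattern.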
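To put the timeout in terms of frames, here is a back-of-the-envelope Python sketch. The function name is made up; the timeout values (2000000 us and 10000 us) and the refresh rates (240 Hz and 60 Hz) are the ones mentioned in this thread:

```python
def frames_spanned(timeout_us: int, refresh_hz: float) -> float:
    """Number of refresh intervals that fit into a timeout given in microseconds."""
    return timeout_us * refresh_hz / 1_000_000


if __name__ == "__main__":
    for timeout_us in (2_000_000, 10_000):
        for hz in (240, 60):
            frames = frames_spanned(timeout_us, hz)
            print(f"{timeout_us} us at {hz} Hz = {frames} frames")
```

At 240 Hz the 2-second timeout covers 480 refresh intervals, while the patched 10 ms value covers 2.4.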
I have the same problem on Arch Linux. I tried many different driver version / kernel version combinations over two days. Nothing worked.
My workaround at the moment is to go back to proprietary driver version 525.147.05.
No problems at all with this version.
The old driver version is provided in the AUR for Arch Linux:
https://aur.archlinux.org/packages?O=0&K=nvidia-525
Installed packages for a working setup:
pacman -Q | grep nvidia
lib32-nvidia-525xx-utils 525.147.05-1
lib32-nvidia-cg-toolkit 3.1-10
lib32-opencl-nvidia 565.77-1
nvidia-525xx-dkms 525.147.05-5
nvidia-525xx-settings 525.147.05-1
nvidia-525xx-utils 525.147.05-5
nvidia-cg-toolkit 3.1-8
nvidia-prime 1.0-5
Newer versions of the driver trigger a freeze of the screen for me in games. The game doesn't crash. The mouse pointer is still there. I can hear the game's sound. But the screen is not refreshed anymore. This happens a few minutes into many games.
More info on my system, maybe it helps to debug:
Graphics:
inxi -G
Graphics:
Device-1: NVIDIA AD107M [GeForce RTX 4060 Max-Q / Mobile] driver: nvidia
v: 525.147.05
Device-2: Advanced Micro Devices [AMD/ATI] Phoenix3 driver: amdgpu
v: kernel
Device-3: Kingcome FHD WebCam driver: uvcvideo type: USB
Display: x11 server: X.Org v: 21.1.15 with: Xwayland v: 24.1.4 driver: X:
loaded: modesetting unloaded: vesa dri: radeonsi gpu: amdgpu resolution:
1: 3840x2160~60Hz 2: 2560x1600~240Hz
API: EGL v: 1.5 drivers: kms_swrast,nvidia,radeonsi,swrast
platforms: gbm,x11,surfaceless,device
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: amd mesa v: 24.3.3-arch1.2
renderer: AMD Radeon Graphics (radeonsi gfx1103_r1 LLVM 19.1.6 DRM 3.59
6.12.9-arch1-1)
API: Vulkan v: 1.4.303 drivers: N/A surfaces: xcb,xlib
Info: Tools: api: eglinfo, glxinfo, vulkaninfo
de: kscreen-doctor,xfce4-display-settings gpu: corectrl, gputop,
intel_gpu_top, lsgpu, nvidia-settings, nvidia-smi x11: xdriinfo,
xdpyinfo, xprop, xrandr
Kernel:
uname -a
Linux ww 6.12.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 10 Jan 2025 00:39:41 +0000 x86_64 GNU/Linux
Nvidia SMI:
nvidia-smi
Sat Jan 18 02:13:20 2025
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05 Driver Version: 525.147.05 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | N/A |
| N/A 39C P0 N/A / 115W | 1MiB / 8188MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Would be nice to see a fix. Hard to believe this bug has been around for such a long time.
And I forgot in the last post: these are my screens (DP-1 is external, eDP-1 is the notebook screen):
xrandr | grep connected
eDP-1 connected (normal left inverted right x axis y axis)
DP-1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 698mm x 393mm
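In xrandr output, a connected output that has no WxH+X+Y geometry after its name is not currently enabled, so in the listing above only DP-1 is active. A small Python sketch that extracts this from such lines; the function name and the returned dict layout are made up for illustration:

```python
import re

# "<name> connected [primary] [WxH+X+Y] (...)"
OUTPUT_LINE = re.compile(
    r"^(?P<name>\S+) connected(?P<primary> primary)?"
    r"(?: (?P<w>\d+)x(?P<h>\d+)\+(?P<x>\d+)\+(?P<y>\d+))?"
)


def connected_outputs(xrandr_text: str) -> list:
    """Return name, primary flag and geometry (or None) per connected output."""
    outputs = []
    for line in xrandr_text.splitlines():
        m = OUTPUT_LINE.match(line)
        if not m:
            continue  # disconnected outputs and mode lines are ignored
        geometry = None
        if m["w"] is not None:
            geometry = (int(m["w"]), int(m["h"]), int(m["x"]), int(m["y"]))
        outputs.append({
            "name": m["name"],
            "primary": m["primary"] is not None,
            "geometry": geometry,
        })
    return outputs
```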