External monitor freezes when using dedicated GPU

simpson38 · December 16, 2024, 9:29pm

Still freezes for me with 565.77 X11 KDE. @amrits what gives?

zac21 · December 17, 2024, 3:56am

Do you folks have Prime installed? I was having issues where stuff on the dGPU was freezing under load if I had the iGPU enabled and didn’t have Prime installed. Once I installed Prime, the freezes stopped.

dmakc · December 20, 2024, 7:58am

What is “Prime”? I can’t find such a package for Debian/Ubuntu.

byme · December 20, 2024, 6:37pm

It should be included in the drivers now. You can run it with the nvidia-settings command: it starts a GUI where you can switch PRIME profiles.

dmakc · December 20, 2024, 6:54pm

Ok, but there are no default profiles:

Where could I get proper profiles and rules to avoid freezes?

byme · December 20, 2024, 7:03pm

No idea, this is what it looks like by default on my new Lenovo Legion

byme · December 20, 2024, 7:05pm

Now I see, you don’t even have a PRIME Profiles menu option. To me it doesn’t look like it’s supported on your computer.

zac21 · December 20, 2024, 8:02pm

I think in Debian/Ubuntu the package is called nvidia-prime.

sudo apt-get install nvidia-prime

dmakc · December 21, 2024, 7:59am

There is no such package for Debian, but I tried installing the Ubuntu one, and nothing changed.

According to the xrandr output, I have a dGPU:

$ xrandr --listproviders 
Providers: number : 2
Provider 0: id: 0x45 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 4 outputs: 3 associated providers: 1 name:modesetting
Provider 1: id: 0x270 cap: 0x2, Sink Output crtcs: 4 outputs: 1 associated providers: 1 name:NVIDIA-G0

And according to glxinfo I’ve got two different render devices:

$ __GL_SYNC_TO_VBLANK=0 __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only glxinfo |head -n5
name of display: :0
display: :0  screen: 0
direct rendering: Yes
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
$ glxinfo |head -n5
name of display: :0
display: :0  screen: 0
direct rendering: Yes
server glx vendor string: SGI
server glx version string: 1.4

So I definitely have two different GPUs in my laptop and PRIME is supported.
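For anyone who wants to run the same check from a script, here is a minimal sketch that parses `xrandr --listproviders` output into provider names and capabilities. The function name `parse_providers` and the regular expression are my own assumptions, written against the two provider lines quoted above rather than any documented output grammar, so other xrandr versions may need adjustments.

```python
import re

# Pattern written against the provider lines quoted in this thread; the exact
# layout of `xrandr --listproviders` output is assumed, not specified.
PROVIDER_RE = re.compile(
    r"Provider\s+(?P<index>\d+):\s+id:\s+(?P<id>0x[0-9a-fA-F]+)\s+"
    r"cap:\s+(?P<cap>0x[0-9a-fA-F]+),?\s*(?P<caps>.*?)\s*crtcs:.*?name:(?P<name>\S+)"
)


def parse_providers(text):
    """Return one dict per provider line found in `text` (hypothetical helper)."""
    providers = []
    for line in text.splitlines():
        match = PROVIDER_RE.search(line)
        if match is None:
            continue  # e.g. the "Providers: number : 2" summary line
        caps = [c.strip() for c in match.group("caps").split(",") if c.strip()]
        providers.append(
            {"index": int(match.group("index")), "name": match.group("name"), "caps": caps}
        )
    return providers


# Sample copied from the output above.
SAMPLE = """Providers: number : 2
Provider 0: id: 0x45 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 4 outputs: 3 associated providers: 1 name:modesetting
Provider 1: id: 0x270 cap: 0x2, Sink Output crtcs: 4 outputs: 1 associated providers: 1 name:NVIDIA-G0
"""

for provider in parse_providers(SAMPLE):
    print(provider["index"], provider["name"], provider["caps"])
```

On a live system the text would come from something like `subprocess.run(["xrandr", "--listproviders"], capture_output=True, text=True).stdout`.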

byme · December 21, 2024, 8:49am

I have Ubuntu as well, and it came when I selected proprietary drivers during the Ubuntu installation. Maybe that’s the difference, and that does the whole setup for you. However, you should be able to select that somewhere in the update center and enable proprietary drivers, and maybe it will update? Dunno. But how do you know PRIME is supported, do you have a mux switch? For example, I do. I’m pretty unsure that PRIME itself fixes it anyway, but you can try… Btw, I use Wayland too.

dmakc · December 21, 2024, 9:19am

PRIME doesn’t use a mux switch. It uses shared memory for rendering. And I can check that the dGPU is used for rendering, e.g. through nvidia-smi:

$ nvidia-smi 
Sat Dec 21 12:17:27 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.142                Driver Version: 550.142        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off |   00000000:01:00.0  On |                  N/A |
| N/A   47C    P5              5W /   60W |     302MiB /   4096MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1399      G   /usr/lib/xorg/Xorg                            115MiB |
|    0   N/A  N/A      6880      G   /usr/lib/vmware/bin/mksSandbox                177MiB |
+-----------------------------------------------------------------------------------------+

When I run VMware in PRIME offload mode, I get a line with /usr/lib/vmware/bin/mksSandbox, which tells us that VMware uses the NVIDIA dGPU. What could it be if not PRIME?
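The same check can be scripted. Below is a rough sketch that pulls the rows out of the "Processes" table of `nvidia-smi` text output and tests whether a given program shows up there. The names `gpu_processes` and `uses_gpu` are hypothetical, and the row pattern is an assumption based only on the table quoted above; the layout can differ between driver versions.

```python
import re

# One row of the "Processes" table, as laid out in the output quoted above.
# The column layout is an assumption based on that sample only.
ROW_RE = re.compile(
    r"^\|\s+(?P<gpu>\d+)\s+(?P<gi>\S+)\s+(?P<ci>\S+)\s+(?P<pid>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<name>.+?)\s+(?P<mem>\d+)MiB\s+\|$"
)


def gpu_processes(smi_text):
    """Return (pid, type, process name, memory in MiB) for each process row."""
    rows = []
    for line in smi_text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            rows.append(
                (
                    int(match.group("pid")),
                    match.group("type"),
                    match.group("name"),
                    int(match.group("mem")),
                )
            )
    return rows


def uses_gpu(smi_text, needle):
    """True if any listed process name contains `needle`."""
    return any(needle in name for _, _, name, _ in gpu_processes(smi_text))


# Process rows copied from the nvidia-smi output above.
SMI_SAMPLE = """
|    0   N/A  N/A      1399      G   /usr/lib/xorg/Xorg                            115MiB |
|    0   N/A  N/A      6880      G   /usr/lib/vmware/bin/mksSandbox                177MiB |
"""

print(gpu_processes(SMI_SAMPLE))
print(uses_gpu(SMI_SAMPLE, "mksSandbox"))
```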

zac21 · December 21, 2024, 6:12pm

Ya, you don’t need a mux for prime. I was having GPU crashes on my desktop with iGPU on the CPU driving 2 monitors and an NV card driving the other 4. Putting the dGPU under stress caused the driver to crash. Installing Prime to mediate the shared resources between the two GPUs solved the crashes for me.

stephematician · December 22, 2024, 8:53pm

This has come up many times; it’s either not a solution in all cases (i.e. freezes have still been observed) or it’s not a solution because it locks in high power draw on a laptop.

stephematician · December 22, 2024, 9:13pm

I think nvidia-prime is just a tool for managing profiles; it’s not necessary for PRIME Render Offload on X (Chapter 35. PRIME Render Offload). Wayland is an entirely different story (I don’t think it even uses PRIME?).
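To illustrate that point: on X, render offload is requested per process through environment variables, the same ones used in the glxinfo command earlier in this thread. A minimal sketch of a launcher follows; `offload_env` and `run_offloaded` are hypothetical helper names, and whether offload actually works still depends on the driver and X server configuration.

```python
import os
import subprocess

# Variables copied from the glxinfo invocation quoted earlier in the thread.
OFFLOAD_VARS = {
    "__NV_PRIME_RENDER_OFFLOAD": "1",
    "__GLX_VENDOR_LIBRARY_NAME": "nvidia",
    "__VK_LAYER_NV_optimus": "NVIDIA_only",
}


def offload_env(base_env=None):
    """Return a copy of the environment with the offload variables added."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(OFFLOAD_VARS)
    return env


def run_offloaded(command):
    """Run `command` (a list of strings) with the offload variables set."""
    return subprocess.run(command, env=offload_env(), check=False)


# Show what would be added to the environment; nothing is launched here.
for key, value in OFFLOAD_VARS.items():
    print(f"{key}={value}")
```

For example, `run_offloaded(["glxinfo"])` would be roughly equivalent to prefixing the variables on the command line, without needing the nvidia-prime package.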

felipe42x · December 23, 2024, 2:08pm

I’m also affected by this bug. I’m not using the dedicated GPU; I’m using the On-Demand mode. Using the performance mode causes performance issues with GNOME Shell (under both Wayland and Xorg). Regarding this issue in particular:

Host: Lenovo Legion Pro 5 16ARX8
Kernel: Linux 6.11.0-13-generic
Resolution: 2560x1600, 240 Hz
External monitor resolution (HDMI): 1920x1080, 74.97 Hz
OS: Ubuntu 24.10
DE: GNOME 47
WM: X11
CPU: AMD Ryzen 9 7945HX with Radeon Graphics × 32
GPU: AMD Radeon 610M
GPU: NVIDIA GeForce RTX 4070 Laptop GPU
Memory: 32,0 GiB
Driver version: 560.35.03

The issue almost instantly occurs when resizing the glxgears window.
Lowering the refresh rate of the external monitor to 60 Hz works for me. So far, it hasn’t failed, even when trying to reproduce the error using vkcube and overlapping glxgears.

Update: No, the bug still occurs even with 60 Hz. It’s harder to reproduce, but I had YouTube on one screen while using Steam with my discrete card on the main screen, and it froze. Also, using Firefox + vkcube + glxgears causes the secondary screen to freeze as well.

shivampatel887 · January 1, 2025, 4:58pm

I have been facing this issue for over a year now. Any fix, guys? Help a poor soul.

dmakc · January 2, 2025, 5:59pm

I’ve investigated the 2-second freezes while resizing glxgears using the 565.77 open kernel module drivers, and these freezes are directly related to an internal timeout in the IdleBaseChannelAll function (open-gpu-kernel-modules/src/nvidia-modeset/src/nvkms.c at 9d0b0414a5304c3679c5db9d44d2afba8e58cc1b · NVIDIA/open-gpu-kernel-modules · GitHub):

/*!
 * Idle all requested heads.
 *
 * First, wait for the heads to idle naturally.  If a timeout is exceeded, then
 * force the non-idle heads to idle, and record these in pReply.
 */
static NvBool IdleBaseChannelAll(
    NVDevEvoPtr pDevEvo,
    const struct NvKmsIdleBaseChannelRequest *pRequest,
    struct NvKmsIdleBaseChannelReply *pReply)
{
    NvU64 startTime = 0;

    /*
     * Each element in subDevicesPerHead[] must be large enough to hold one bit
     * per subdevice.
     */
    ct_assert(NVKMS_MAX_SUBDEVICES <=
              (sizeof(pRequest->subDevicesPerHead[0]) * 8));

    /* Loop until all head,sd pairs are idle, or we time out. */
    do {
        const NvU32 timeout = 2000000; /* 2 seconds */


        /*
         * Clear the pReply data,
         * IdleBaseChannelCheckIdle() will fill it afresh.
         */
        nvkms_memset(pReply, 0, sizeof(*pReply));

Matching Linux kernel function_graph trace:

# tracer: function_graph
#
# function_graph latency trace v1.1.5 on 6.9.12
# --------------------------------------------------------------------
# latency: 0 us, #3/3, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:20)
#    -----------------
#    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
#    -----------------
#
#                       _-----=> irqs-off
#                      / _----=> need-resched
#                     | / _---=> hardirq/softirq
#                     || / _--=> preempt-depth
#                     ||| /
# CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
# |     |    |        ||||      |   |                     |   |   |   |
  6)   Xorg-1464    |  ..... | $ 2000502 us  |  } /* nvkms_unlocked_ioctl [nvidia_modeset] */
  8)  InputTh-1866  |  ..... | $ 1998402 us  |  } /* nvkms_unlocked_ioctl [nvidia_modeset] */
  7)  nvidia--870   |  ..... | $ 1997807 us  |  } /* nvkms_kthread_q_callback [nvidia_modeset] */

As you can see, Xorg calls the nvidia_modeset ioctl through NVIDIA’s Xorg driver, and the call blocks for almost 2 seconds.
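For longer traces, picking out such stalls by eye gets tedious. Here is a small sketch that scans function_graph output for calls slower than a threshold. The helper name `slow_calls` and the regular expressions are assumptions written against the three trace lines above (including the `$` delay marker), not a general ftrace parser.

```python
import re

# Matches lines such as:
#   6)   Xorg-1464    |  ..... | $ 2000502 us  |  } /* nvkms_unlocked_ioctl [nvidia_modeset] */
# The optional character before the duration is a delay marker such as "$".
TRACE_RE = re.compile(
    r"^\s*(?P<cpu>\d+)\)\s+(?P<task>\S+)-(?P<pid>\d+)\s+\|.*?\|"
    r"\s*[!#*@$+]?\s*(?P<us>[\d.]+)\s+us\s+\|\s*(?P<call>.*)$"
)
FUNC_RE = re.compile(r"/\*\s*(\w+)")


def slow_calls(trace_text, threshold_us):
    """Return (task, pid, duration_us, function) for calls at or above the threshold."""
    hits = []
    for line in trace_text.splitlines():
        match = TRACE_RE.match(line)
        if match is None:
            continue  # header lines and anything not in the expected layout
        duration = float(match.group("us"))
        if duration < threshold_us:
            continue
        func = FUNC_RE.search(match.group("call"))
        hits.append(
            (
                match.group("task"),
                int(match.group("pid")),
                duration,
                func.group(1) if func else match.group("call").strip(),
            )
        )
    return hits


# Trace lines copied from the post above.
TRACE_SAMPLE = """# CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
  6)   Xorg-1464    |  ..... | $ 2000502 us  |  } /* nvkms_unlocked_ioctl [nvidia_modeset] */
  8)  InputTh-1866  |  ..... | $ 1998402 us  |  } /* nvkms_unlocked_ioctl [nvidia_modeset] */
  7)  nvidia--870   |  ..... | $ 1997807 us  |  } /* nvkms_kthread_q_callback [nvidia_modeset] */
"""

for task, pid, duration, func in slow_calls(TRACE_SAMPLE, 1_000_000):
    print(f"{task} (pid {pid}): {func} took {duration / 1e6:.3f} s")
```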

I changed this value to 10 ms (10000), rebuilt the module, and glxgears began resizing flawlessly … until my external monitor froze completely. So I think there are two different issues in the NVIDIA driver: the first is caused by waiting for the rendering queue to become idle, and the second is undetectable at the moment. The Linux kernel function_graph tracer can’t help detect this issue, or maybe I don’t know how to set it up properly. :-(

PS: I can’t understand why NVIDIA’s developers use a 2-second timeout in such a case. To me it is almost an eternity compared to the monitor’s frame time.
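To put a number on that, here is a quick back-of-the-envelope calculation of how many refresh intervals fit into the timeout. The refresh rates of 60 Hz and 240 Hz are taken from other posts in this thread as illustrative values, not from the setup described here, and `frames_elapsed` is just a helper name for the sketch.

```python
def frames_elapsed(timeout_us, refresh_hz):
    """Number of refresh intervals that pass during a timeout given in microseconds."""
    return timeout_us * refresh_hz / 1_000_000


ORIGINAL_TIMEOUT_US = 2_000_000  # value from IdleBaseChannelAll quoted above
PATCHED_TIMEOUT_US = 10_000      # the 10 ms value tried in the experiment

for hz in (60, 240):
    print(
        f"{hz} Hz: {frames_elapsed(ORIGINAL_TIMEOUT_US, hz):g} frames at 2 s, "
        f"{frames_elapsed(PATCHED_TIMEOUT_US, hz):g} frames at 10 ms"
    )
```

At 60 Hz the original timeout spans 120 refresh intervals, while the 10 ms value is shorter than a single one.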

sad_burger · January 15, 2025, 9:06pm

For what it’s worth, I’ve discovered a workaround. If I boot into BIOS and change the option to “Dedicated Graphics” from “Hybrid Graphics”, the problem no longer occurs. Obviously, this is far from ideal, as the dGPU never powers down while in this setting and chews up battery, but it’s better than nothing. This is on a Lenovo LOQ 15 with RTX4050 and Ryzen7 8845HS.

wowo · January 18, 2025, 1:17am

I have the same problem on Arch Linux. I tried many different driver version / kernel version combinations over two days. Nothing worked.

My workaround at the moment is to go back to proprietary driver version 525.147.05.
No problems at all with this version.

The old driver version is available in the AUR for Arch Linux:
https://aur.archlinux.org/packages?O=0&K=nvidia-525

Installed packages for a working setup:

pacman -Q | grep nvidia
lib32-nvidia-525xx-utils 525.147.05-1
lib32-nvidia-cg-toolkit 3.1-10
lib32-opencl-nvidia 565.77-1
nvidia-525xx-dkms 525.147.05-5
nvidia-525xx-settings 525.147.05-1
nvidia-525xx-utils 525.147.05-5
nvidia-cg-toolkit 3.1-8
nvidia-prime 1.0-5

Newer versions of the driver trigger a freeze of the screen for me in games. The game doesn’t crash. The mouse pointer is still there. I can hear the game’s sound. But the screen is not refreshed anymore. This happens a few minutes into many games.

More info on my system, maybe it helps to debug:

Graphics:

inxi -G
Graphics:
  Device-1: NVIDIA AD107M [GeForce RTX 4060 Max-Q / Mobile] driver: nvidia
    v: 525.147.05
  Device-2: Advanced Micro Devices [AMD/ATI] Phoenix3 driver: amdgpu
    v: kernel
  Device-3: Kingcome FHD WebCam driver: uvcvideo type: USB
  Display: x11 server: X.Org v: 21.1.15 with: Xwayland v: 24.1.4 driver: X:
    loaded: modesetting unloaded: vesa dri: radeonsi gpu: amdgpu resolution:
    1: 3840x2160~60Hz 2: 2560x1600~240Hz
  API: EGL v: 1.5 drivers: kms_swrast,nvidia,radeonsi,swrast
    platforms: gbm,x11,surfaceless,device
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: amd mesa v: 24.3.3-arch1.2
    renderer: AMD Radeon Graphics (radeonsi gfx1103_r1 LLVM 19.1.6 DRM 3.59
    6.12.9-arch1-1)
  API: Vulkan v: 1.4.303 drivers: N/A surfaces: xcb,xlib
  Info: Tools: api: eglinfo, glxinfo, vulkaninfo
    de: kscreen-doctor,xfce4-display-settings gpu: corectrl, gputop,
    intel_gpu_top, lsgpu, nvidia-settings, nvidia-smi x11: xdriinfo,
    xdpyinfo, xprop, xrandr

Kernel:

uname -a
Linux ww 6.12.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 10 Jan 2025 00:39:41 +0000 x86_64 GNU/Linux

Nvidia SMI:

nvidia-smi 
Sat Jan 18 02:13:20 2025       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   39C    P0    N/A / 115W |      1MiB /  8188MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Would be nice to see a fix. Hard to believe this bug has been around for such a long time.

wowo · January 18, 2025, 1:26pm

And I forgot this in the last post: these are my screens (DP-1 is external, eDP-1 is the notebook screen):

xrandr | grep connected
eDP-1 connected (normal left inverted right x axis y axis)
DP-1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 698mm x 393mm
