On my system, the Nvidia 515 release X11 driver keeps busy polling the Linux kernel for clock_gettime (through libc) in a tight, busy loop:
On an otherwise idle system, this consumes up to 40% CPU.
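For intuition on why this burns so much CPU: each individual clock_gettime call is cheap (often served from the vDSO without even a syscall), but an uninterrupted loop of such calls still pegs a core. A crude illustration in shell, not the driver's actual code, using date (which calls clock_gettime internally):

```shell
# Crude illustration (not the driver's actual code): a tight loop that reads
# the clock as fast as possible saturates one CPU core, analogous to the busy
# loop visible in the Xorg flamegraph. 'date +%s%N' calls clock_gettime
# internally; timeout stops the loop after 2 seconds.
timeout 2 sh -c 'while :; do date +%s%N >/dev/null; done'
```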
This problem does not seem to be new; it has been reported before in High CPU usage on xorg when the external monitor is plugged in, but the follow-up postings seem to have watered down the very good technical analysis of the original post. I am therefore creating this very specific fresh topic.
What you see above is a rendition of the Xorg process created by GitHub - janestreet/magic-trace: magic-trace collects and displays high-resolution traces of what a process is doing. Under the hood this is simply perf, but with a friendlier presentation in the form of a flamegraph (sudo magic-trace attach -pid $PID_OF_XORG); the output can then be rendered in multiple ways. I have a strictly local server running for this, which is simple to set up.
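If magic-trace is not at hand, plain perf can produce the same evidence. A sketch, where the pgrep-based PID lookup is my assumption and may need adjusting on other distros:

```shell
# Sample Xorg's call stacks for 5 seconds, then check how many samples land
# in clock_gettime. The pgrep lookup is an assumption; adjust for your setup.
XORG_PID=$(pgrep -x Xorg | head -n1)
sudo perf record -F 997 -g -p "$XORG_PID" -- sleep 5
sudo perf report --stdio | grep -i clock_gettime
```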
This rendition shows that on my idle Tiger Lake-H 8-core system, almost all of the CPU load is consumed by Xorg, and there by what appears to be a tight loop inside the (closed-source) nvidia_drv, the Nvidia X11 driver module.
This RTX 3060 Optimus notebook is running Fedora 36 (latest kernel, latest Mesa, latest KDE, latest X) with the following setup:
- Intel GPU serves (only) the internal display (and HDMI)
- Nvidia GPU serves (only) the USB-C output (via DisplayPort Alternate Mode to a DisplayPort display)
- Intel is the primary GPU, Nvidia is offloading
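A PRIME setup like the one above can be cross-checked with xrandr; the provider names in the comment are typical values, not guaranteed:

```shell
# List the X11 providers; on a typical Optimus PRIME setup the Intel iGPU
# appears as provider 0 (Source Output) and the Nvidia dGPU as an offload sink.
# Exact provider names (e.g. "modesetting", "NVIDIA-G0") vary by driver version.
xrandr --listproviders
```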
Notebook screen: 3072x1920 @ 60.14 Hz
External screen: 3840x2160 @ 60.00 Hz (no G-Sync)
This problem gets reduced a little by forcing the GPU to prefer maximum performance; this clocks up everything, although I only seem to be consuming bandwidth for memory transfers to the Nvidia-controlled connector / crtc. And clocking up everything creates loads of heat and noise.
A good way to demonstrate this problem on my notebook is to run
sudo nvidia-smi --reset-memory-clocks
sudo nvidia-smi --lock-memory-clocks=100,100
to force the GPU into a lower power state (don't worry about the 100,100; apparently nvidia-smi will auto-correct that). Things do get a little bit laggy, but suddenly CPU consumption is even higher, 50+%, with many, many more clock_gettime calls (in a busy loop).
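To watch this correlation live while the memory clocks are locked, the GPU clocks and Xorg's CPU usage can be polled side by side. A sketch: pidstat comes from the sysstat package, and the PID lookup is an assumption:

```shell
# Current graphics/memory clocks as reported by the driver
nvidia-smi -q -d CLOCK
# Xorg CPU usage, five one-second samples (pidstat is part of sysstat;
# the pgrep lookup is an assumption, adjust for your system)
pidstat -u -p "$(pgrep -x Xorg | head -n1)" 1 5
```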
So looking at this from the outside, there is a strong correlation between low memory clock speed on the GPU and (very high) CPU utilization from busy polling the Linux kernel via clock_gettime.
But why, and how can this be stopped please?
I only want the Nvidia driver to display the framebuffer content it was handed (crtc → port); in PRIME offload, it doesn't produce anything on top of that unless I tell it to do so.