At seemingly random points in time kwin_x11 freezes the desktop for several seconds, and after that it unfreezes with a notification that compositing was restarted. In the system logs I can see that there was a page allocation failure in the kernel module:
[87398.720448] kwin_x11: page allocation failure: order:4, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
[87398.720453] CPU: 4 PID: 1988 Comm: kwin_x11 Tainted: P OE 5.4.0-48-lowlatency #52-Ubuntu
[87398.720453] Hardware name: System manufacturer System Product Name/P8Z68-V PRO, BIOS 3603 11/09/2012
[87398.720454] Call Trace:
[87398.720460] dump_stack+0x6d/0x9a
[87398.720462] warn_alloc.cold+0x7b/0xdf
[87398.720464] __alloc_pages_slowpath+0xe34/0xe80
[87398.720467] ? compact_zone_order+0xbb/0xf0
[87398.720468] ? get_page_from_freelist+0x233/0x390
[87398.720470] __alloc_pages_nodemask+0x2d2/0x320
[87398.720471] alloc_pages_current+0x87/0xe0
[87398.720472] kmalloc_order+0x1f/0x80
[87398.720473] kmalloc_order_trace+0x24/0xc0
[87398.720474] __kmalloc+0x228/0x280
[87398.720490] nvkms_alloc+0x24/0x60 [nvidia_modeset]
[87398.720499] _nv002714kms+0x16/0x30 [nvidia_modeset]
[87398.720501] WARNING: kernel stack frame pointer at 00000000ffcb012a in kwin_x11:1988 has bad value 0000000000000000
I have seen this happen at different times, both under CPU load and when mostly idle, and with a lot of free host memory available.
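For anyone reading the log line above: `order:4` means the kernel was asked for 2^4 = 16 physically contiguous pages, which is 64 KiB assuming 4 KiB pages (the usual x86-64 page size; that is an assumption about this machine). A request like that can fail even with plenty of free memory if the free memory is fragmented. Below is a minimal Python sketch that decodes such a line. The function name and the regex are purely illustrative and not part of any existing tool.

```python
import re

# Matches e.g.
# "kwin_x11: page allocation failure: order:4, mode:0x40cc0(GFP_KERNEL|__GFP_COMP)"
FAILURE_RE = re.compile(
    r"(?P<comm>\S+): page allocation failure: order:(?P<order>\d+), "
    r"mode:(?P<mode>0x[0-9a-fA-F]+)\((?P<flags>[^)]*)\)"
)

PAGE_SIZE = 4096  # assumption: 4 KiB pages (typical for x86-64)


def decode_failure(line):
    """Return details of a page allocation failure line, or None if no match."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    order = int(m.group("order"))
    return {
        "comm": m.group("comm"),              # process that hit the failure
        "order": order,                       # log2 of the number of pages
        "pages": 1 << order,                  # contiguous pages requested
        "bytes": (1 << order) * PAGE_SIZE,    # contiguous bytes requested
        "flags": m.group("flags").split("|"), # GFP flags as printed
    }


if __name__ == "__main__":
    sample = (
        "[87398.720448] kwin_x11: page allocation failure: order:4, "
        "mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0"
    )
    print(decode_failure(sample))
```

The regex uses `search`, so it does not care whether the line starts with a bare dmesg timestamp or a syslog prefix.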
Note that unlike the problem described in the other topic (440.48.02: Random X.org lock ups due to kernel module crash), in this case it doesn’t happen on waking up or going to sleep (DPMS), and it happens with HardDPMS=False. It also affects KWin instead of Xorg.
This started happening after the upgrade to 455.23.04. I didn’t have this problem with the 450 series.
One observation I made is that this problem is more likely to reproduce when a lot of filesystem IO is happening, for example when a large (multi-gigabyte) directory with lots of files (tens or hundreds of thousands) is being copied between partitions on an SSD.
And yes, the problem probably occurs faster when there is lots of disk IO. I run ZFS on two SSDs and three 2TB drives so when I try to launch any game… it just freezes right off the bat!
Even staying idle at the desktop will make it freeze in under 15 minutes, usually much less!
I’m having these, too. For me it is nvidia-modeset/: page allocation failure, not kwin or xorg as others experienced. I was moving files from one external HDD to another when suddenly everything visual froze (the music stopped too). Pressing Ctrl+Alt+F2 and then F1 helped a couple of times, but after the third or fourth time it didn’t. My monitors reported no signal and I had to reset the PC yet again.
Here’s (yet another) kernel log of the problem.
The second one is from after I was able to recover a few times through ctrl+alt+f1/f2 switching. It happened the moment I was trying to start an application (Discord). Looks like x11 completely broke down then.
Wanted to chip in here. I believe I am experiencing the same issue. I am attaching the log info here. In addition, my system is a 9900K with 32GB of RAM and an RTX 2080, with driver 450.80.02-0ubuntu0.20.04.2. Attached: X11pagefault.txt (26.0 KB)
I also just experienced what seems to be the same issue while running chromium. I have an RTX 2070 SUPER on Arch Linux, with driver 455.28-7 and GNOME DE.
I experienced a complete freeze on my desktop, and the cursor disappeared. As with bogus12, repeatedly trying to switch ttys eventually fixed the issue, and I was able to resume work on chromium. This has so far happened only once for me though.
I have had this error for maybe 2-3 weeks now too. During this time I have been using the latest Arch Linux 5.8.X kernels and the “top of the line newest” nvidia drivers (at the moment 455.28-1). In most cases the freeze happens while using web browsers (chromium, firefox and vivaldi are all affected), and in some other cases during mouse interactions.
I can confirm that switching to a console with CTRL-ALT-F2 and back sometimes gets my desktop (using i3wm) unfrozen. In some cases I have to “kill -9” some process that seems to be affected by the freeze. Sometimes it’s the browser, sometimes it’s polybar or emacs-daemon.
I tried it with and without profile-sync-daemon to see if that helps with the browser problem, since someone above mentioned it being connected to high IO load. But that changed nothing.
My system only has 8GB at the moment, and the bug happens even when most of those 8GB are free.
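On the "most of the memory is free" point: the total amount of free memory doesn't say whether a contiguous block of the requested size exists. The failing requests in the logs in this thread are order:4, i.e. 16 contiguous pages. On Linux, `/proc/buddyinfo` lists, per zone, how many free blocks of each order exist (the k-th number counts free blocks of 2^k pages, starting at order 0). Here is a rough Python sketch for summing the free blocks of order 4 and above. The helper name is made up, and whether low counts actually line up with these freezes is an assumption that nobody here has confirmed.

```python
def free_blocks_at_or_above(buddyinfo_text, min_order):
    """Sum free blocks of order >= min_order for each (node, zone).

    Each /proc/buddyinfo line looks like:
    "Node 0, zone   Normal   120   80   40   10   3   1   0   0   0   0   0"
    where the numbers are free-block counts for order 0, 1, 2, ...
    """
    result = {}
    for line in buddyinfo_text.splitlines():
        parts = line.split()
        # Skip anything that is not a "Node N, zone NAME counts..." line.
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = parts[1].rstrip(",")
        zone = parts[3]
        counts = [int(x) for x in parts[4:]]
        result[(node, zone)] = sum(counts[min_order:])
    return result
```

To try it on a live system, pass it the file contents, e.g. `free_blocks_at_or_above(open("/proc/buddyinfo").read(), 4)`, ideally captured around the time a freeze happens.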
It should apply to all memory allocation failures that happen during mode setting operations. I’m not 100% sure it applies to the one in that other thread, but I think so.
Seeing the exact same thing with 455.28 and an RTX 2070 Super on Ubuntu 20.10 (kernel 5.8.0-23-generic) running plasma desktop. I wasn’t seeing this issue at all with the 450 train, but it happens frequently after moving to 455.28.
Oct 21 19:27:22 H510 kernel: [441965.951840] warn_alloc: 3 callbacks suppressed
Oct 21 19:27:22 H510 kernel: [441965.951842] kwin_x11: page allocation failure: order:4, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
Oct 21 19:27:22 H510 kernel: [441965.951847] CPU: 14 PID: 332304 Comm: kwin_x11 Tainted: P W OE 5.8.0-23-generic #24-Ubuntu
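Since the failing process differs between reports in this thread (kwin_x11, Xorg, nvidia-modeset), it may help to tally failures per process when comparing logs. This is a small Python sketch under the assumption that the lines look like the excerpts posted here, either raw dmesg output or syslog-prefixed. The function name is hypothetical.

```python
import re
from collections import Counter

# Captures the process name and the allocation order from a failure line.
FAILURE_RE = re.compile(r"(\S+): page allocation failure: order:(\d+)")


def tally_failures(lines):
    """Count page allocation failures per (process name, order)."""
    counts = Counter()
    for line in lines:
        m = FAILURE_RE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts
```

It can be fed any iterable of lines, for example an open kernel log file or the saved output of `dmesg`.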
This bug brings down our systems in one way or another daily (requiring force shutoff) along with service disruptions more often; can you confirm what driver series is appropriate to downgrade to?