Black or incorrect textures in KDE

Yeah, I didn’t want to hijack your thread with the vsync thing. It’s just that I suspected you are using some specific hacks (like everyone else does, since Nvidia seems entirely unable to design their driver to work correctly with any X11 compositor…).
I’m very certain that I don’t suffer any corruption issues (I never saw anything turn black after toggling off compositing) apart from that standby thing, so I guess it’s not a general problem and perhaps there’s even a connection to your specific vsync hacks.
Enforcing triple buffering (TB), by the way, reproducibly leads to stuttering here when moving windows.

As far as we are aware, KWin isn’t being tested by its developers on NVIDIA hardware, despite our offers to provide hardware for free. This means that some KWin bugs aren’t caught in time, or at all, and that we never get proper bug reports from KWin developers when they feel that our driver, not KWin, is at fault for a particular problem.

For example, if GLX_EXT_buffer_age is claimed not to work reliably on the NVIDIA driver, we’re eager to hear about that and fix it, but we need a real bug report for that, that exposes the problem.

As for the Xid 31 reports with KWin, we’re investigating, but we’re still looking for a reliable way to reproduce these issues. We haven’t managed to observe the problem locally so far, so we’d like to hear about any reliable recipe for reproduction. It isn’t known at this point whether the issue is an NVIDIA bug or a KWin bug.

Thanks

Maybe fix X11 performance in general first?
Example: Take any desktop environment, with or without OpenGL window compositing. Play a video with any player in a window. Open a new window with any program, e.g. open your text editor or file browser.
-> video stutters badly during that because Xorg performance drops.

This only happens with the proprietary nvidia drivers; even Nouveau doesn’t show this behavior.
It also happens with very small window layers like overlay tooltip pop-ups in programs etc.
SO ANNOYING!!

@ahuillet:

Thanks for the reply. At least, now I know Nvidia is aware of the problem.

Are you able to reproduce the black textures problem the way I described?

For example, if GLX_EXT_buffer_age is claimed not to work reliably on the NVIDIA driver, we’re eager to hear about that and fix it, but we need a real bug report for that, that exposes the problem.

Yes, I would love this problem to be fixed as well. Unfortunately, I don’t know any reliable way to reproduce the problem, other than using KDE with KWin and my settings above, minus KWIN_USE_BUFFER_AGE, on a daily basis. I can also reproduce the occasional texture flickering if I replace KWin with xfwm4+compton, where compton performs compositing with GLX_EXT_buffer_age (in ~/.config/compton.conf, there should be the parameter glx-swap-method = "-1";). The problem is that the flickering does not show right away and may appear after an hour, a day, or a few days of system use. Do you have any suggestions on how I can help with this problem?

For the Xid 31 issues, in my experience launching certain OpenGL apps, Steam chief among them, seems to be the trigger. After a fresh reboot, closing and reopening Steam a few times is usually enough to trigger it. It is caused less often by launching other OpenGL apps, such as fullscreen games.

My setup:

  • GTX 1080, kwin 5.11.3, nvidia 387.22
  • Disable Vsync in nvidia-settings, enable allow flipping. G-Sync enabled, but multiple monitors in use, so effectively disabled.
  • Set KDE compositing to "Automatic" tear prevention, OpenGL 3.1
  • KWIN_TRIPLE_BUFFER=1 kwin_x11 --replace
  • Bind a key in global shortcuts -> custom shortcuts to the above (KWIN_TRIPLE_BUFFER=1 kwin_x11 --replace) if you value your sanity, as every time this happens the display stops updating until kwin is relaunched. VT switching to get a console to kill kwin is risky - after doing this a few times vsync fails (like, forever, until a reboot(?)) and framerate drops everywhere. Additionally VT switching seems to give a ~10% chance that the driver never comes back.

At this point, launch steam, login, exit steam, repeat a few times. I’ve never survived on a fresh reboot across two different systems (sharing in that they have a 1080) longer than a few launches of steam. It seems to get worse as time goes on, until any opengl app starting up will trigger the error, making above keybind a necessity.
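
In case it helps anyone keep count while repeating the steps above, here is a minimal Python sketch that tallies Xid codes in kernel log text. The line shape it matches ("NVRM: Xid (PCI:...): 31, ...") is an assumption based on what these messages usually look like, and count_xids is a name made up for this post, not part of any NVIDIA tool.

```python
import re
from collections import Counter

# Assumed message shape: "NVRM: Xid (PCI:0000:01:00): 31, Ch 00000018, ..."
XID_RE = re.compile(r"NVRM: Xid \(([^)]*)\): (\d+)")

def count_xids(lines):
    """Count Xid codes found in an iterable of kernel log lines."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts

# Made-up sample lines, only to show the call.
sample = [
    "kernel: NVRM: Xid (PCI:0000:01:00): 31, Ch 00000018, engmask 00000101",
    "kernel: usb 1-2: new high-speed USB device",
    "kernel: NVRM: Xid (PCI:0000:01:00): 31, Ch 00000020, engmask 00000101",
]
for code, n in sorted(count_xids(sample).items()):
    print(f"Xid {code}: {n}")
```

To use it on a real system you would feed it the kernel log instead of the sample list, for example the lines of saved dmesg output.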

(While we’re on the subject, kwin claims vsync doesn’t function correctly unless you set __GL_YIELD=usleep or use their triple buffering under the nvidia drivers, no idea if it is relevant to this issue)

BTW, I’ve found the KDE bug that recommended the KWIN_USE_BUFFER_AGE workaround:

https://bugs.kde.org/show_bug.cgi?id=363500

It also contains a video showing the problem. I’m posting it here just in case someone is able to salvage any technical hints from it (or the bug it is linked as a duplicate of).

The claim that __GL_YIELD=usleep is required points at an application bug, possibly a race condition due to missing synchronization.

I’ve tried both Nephyrin’s and Lastique’s instructions. On an Arch Linux install updated today (KWin 5.11.3), I opened many different windows, one maximized in the background, and resized windows without seeing any black flicker or fully black window.
I started Steam and switched between tabs, then restarted Steam about ten times, without observing either a black window or an Xid error.

NVIDIA driver 387.22 on GeForce GTX 770.

@ahuillet:

I tried building Kwin 5.11.3 from sources on my Kubuntu 17.10 and can still reproduce the black textures. Did you configure the environment variables as I described?

I didn’t think any environment variable was needed to reproduce the problem, only to solve a separate issue related to VSync?

The environment variables work around multiple issues, one of them is Vsync. I don’t know the source of the problem, so I cannot tell if any of the variables are essential. But I can tell that I can reproduce the problem with these variables, so for now I consider all of them essential. You may need to also set up xorg.conf. I have only /etc/X11/xorg.conf.d/20-nvidia.conf with the following content:

Section "Device"
    Identifier "Default nvidia Device"
    Driver "nvidia"
    Option "NoLogo" "True"
    Option "CoolBits" "12"
    Option "TripleBuffer" "True"
EndSection

FYI It’s not just Kwin that suffers from the buffer_age issue. Enlightenment does too. We’ve used buffer age on both EGL and GLX drivers for a long time and at least the commercial EGL drivers for ARM systems (Mali, Imgtec) that supported this worked fine. It’s the exact same rectangle update/history tracking logic for both paths, but I notice that sometimes we get an “old frame that is different to what buffer age says it is” and this results in some parts of the screen going into flicker-fest every time any update happens until that area of the screen is redrawn. Indeed the workaround is to force “full updates” in the settings panel.
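
For readers who have not seen this kind of logic, here is a rough Python sketch of buffer-age based update tracking. It is not the EFL or KWin code; DamageHistory, union and the bounding-box simplification are all made up for illustration. The idea is that a back buffer of age N already holds the frame from N frames ago, so it needs the current damage plus the damage of the N-1 frames in between, and age 0 (or an age older than the kept history) means a full redraw.

```python
from collections import deque

def union(rects):
    """Bounding box of (x1, y1, x2, y2) rectangles, or None if there are none."""
    rects = [r for r in rects if r is not None]
    if not rects:
        return None
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

class DamageHistory:
    """Hypothetical tracker: remembers the damage of the last few frames."""

    def __init__(self, screen, depth=4):
        self.screen = screen             # full-screen rectangle
        self.past = deque(maxlen=depth)  # damage of earlier frames, newest first

    def repaint_region(self, age, damage):
        """Region to redraw into a back buffer that reports the given age."""
        if age <= 0 or age - 1 > len(self.past):
            region = self.screen         # undefined contents or too old: full update
        else:
            region = union([damage] + list(self.past)[:age - 1])
        self.past.appendleft(damage)
        return region

history = DamageHistory(screen=(0, 0, 100, 100))
print(history.repaint_region(0, (0, 0, 10, 10)))     # fresh buffer
print(history.repaint_region(1, (20, 20, 30, 30)))   # buffer holds previous frame
print(history.repaint_region(2, (50, 50, 60, 60)))   # buffer is two frames old
```

If the reported age is smaller than the real one, the region computed this way misses some older damage, which would look like the stale, flickering areas described above.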

Trying to figure out if it is the driver or not is really hard because we basically would have to keep a history log of the last N frames of backbuffers (literally read all the pixels and store them) before and after render, and if you “see the bug” then dump them all out and hope history went back far enough. That means N needs to be non-trivial (like maybe 100-500 buffers so we could go back a few seconds in time to the issue if we dump via some hotkey). It’s kind of a nasty thing to run all day on your desktop just to catch when this happens maybe 3-5 times per day on average.

If the nvidia driver had more probes and ways of digging into its logic, I’d be doing just that. If source existed, I’d maybe add things like frame display counts to each back buffer (a monotonically increasing integer for each frame swapped) and then have some logic to check that buffer age matches this counter correctly or something. Cheaper than entire 5120x1440 buffer reads 2x per frame… :)
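
A toy Python model of that counter idea, purely as a sketch: each back buffer is stamped with the frame number last drawn into it, and the age the driver reports is compared with the age implied by the stamp. StampedSwapChain and check_age are invented names, and the round-robin buffer rotation is an assumption; a real driver may hand out buffers in a different order, which is exactly why the stamp would have to live with the buffer itself.

```python
class StampedSwapChain:
    """Toy model: each buffer remembers the frame number last drawn into it."""

    def __init__(self, n_buffers):
        self.stamps = [None] * n_buffers  # None means never rendered
        self.frame = 0                    # frames swapped so far (monotonic)
        self.back = 0                     # index of the current back buffer

    def true_age(self):
        """Age of the current back buffer, in the buffer_age sense (0 = undefined)."""
        stamp = self.stamps[self.back]
        return 0 if stamp is None else (self.frame + 1) - stamp

    def swap(self):
        """Finish the current frame and move on to the next buffer (assumed round-robin)."""
        self.frame += 1
        self.stamps[self.back] = self.frame
        self.back = (self.back + 1) % len(self.stamps)

def check_age(chain, reported_age):
    """Return (matches, expected_age) for an age value reported by the driver."""
    expected = chain.true_age()
    return expected == reported_age, expected

chain = StampedSwapChain(n_buffers=3)
for _ in range(3):
    chain.swap()
print(check_age(chain, reported_age=3))
print(check_age(chain, reported_age=2))
```

A mismatch could be logged together with the frame number, which is far cheaper than reading back and storing whole buffers.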

If nvidia have some way to help get more info like this out of a driver (perhaps an engineering/debug build or version), it’d be really helpful. The fact that both KWin and Enlightenment suffer indicates the bug MIGHT be in the driver. The logic for buffer_age is actually in EFL, so applications would also be affected if the compositor were not redirecting. This is the case in mobile, TV and similar environments, where apps spend most of their life unredirected, and the same buffer_age logic works as above.

Well, I’ve had texture corruption in Civilization 5 after switching to a text VT and back to the Xorg one.
I’ve had the same with Portal and Half-Life 2.
If the issue is KWin’s fault, then it is also Civilization 5’s fault and the Source engine’s fault, and god only knows how many more could be added to the list.

Or maybe it is just nvidia, who knows (meh).

XRender has been GPU-accelerated for 10 years:
https://www.phoronix.com/scan.php?page=article&item=934&num=2

And paired with ForceFullCompositionPipeline and some quirks in kwinrc (MaxFPS=60 or whatever), it is a nice alternative to OpenGL compositing.
But it just offers translucency, so effects like magic lamp, blur, cube, and wobbly windows will not work.

Exactly, XRender basically offers just alpha blending. Any other effects or drawing will have to be done by the CPU. That is opposed to OpenGL where you have shaders at your disposal.

BTW, ForceCompositionPipeline is broken:

https://devtalk.nvidia.com/default/topic/1029159/linux/forcecompositionpipeline-causes-hard-lockups/

So tearing will be a problem. It’s a non-starter for me.

As I understand it, switching to a VT wipes GPU memory, so every application has to support the Nvidia extension to receive a notification when that happens and redraw everything. KWin supports that in its latest releases, but I doubt any games do.

Anyway, this is NOT what this thread discusses. This discussion is about black/incorrect textures when you don’t switch to VT or suspend.

On the topic of buffer_age, I found that in KDE I often see flickering textures when I’m coding in QtCreator. It often displays tooltips, which morph and resize as you type the code. Sometimes you can see remnants of those animations flicker on the editor view.

Hi,

Just in case it can help others. I solved almost all the pixel corruption problems on KDE with a simple configuration change. My environment is KDE5 + Plasma + NVidia.

In your home directory, open the file .config/plasmashellrc and add the following lines:

[QtQuickRendererSettings]
GraphicsResetNotifications=true

Also in your home directory, open the file .config/kwinrc and add the following lines:

[QtQuickRendererSettings]
GraphicsResetNotifications=true

With that configuration, all the KDE problems related to pixel corruption were solved. The only problem that persists on my system is a Chromium-specific one: pixel corruption after resume from suspend. But it’s not a big problem, since the corruption goes away after minimizing the Chromium window. I don’t even need to close the application.

If I’m not wrong, the flag GraphicsResetNotifications=true makes Plasma and KWin handle the NV_robustness_video_memory_purge event in order to rebuild volatile memory (such as framebuffer objects) when needed.
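
If you would rather script the change than edit by hand, here is a minimal Python sketch that sets a key inside a group of INI-style text. ensure_key is a helper name made up for this post, and it assumes the simple "[Group]" / "Key=Value" layout shown above; it only transforms text, so back up the real file and check the result before writing it back.

```python
def ensure_key(text, group, key, value):
    """Return config text with key=value set in [group], adding the group if missing."""
    lines = text.splitlines()
    header = f"[{group}]"
    entry = f"{key}={value}"
    try:
        start = next(i for i, line in enumerate(lines) if line.strip() == header)
    except StopIteration:
        # Group missing: append it at the end, separated by a blank line.
        if lines and lines[-1].strip():
            lines.append("")
        lines += [header, entry]
        return "\n".join(lines) + "\n"
    # Scan the group's body, which ends at the next "[...]" header.
    for i in range(start + 1, len(lines)):
        stripped = lines[i].strip()
        if stripped.startswith("["):
            break
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = entry  # key already present: overwrite its value
            return "\n".join(lines) + "\n"
    lines.insert(start + 1, entry)  # key missing: add it right under the header
    return "\n".join(lines) + "\n"

# Made-up starting content, only to show the call.
before = "[Compositing]\nBackend=OpenGL\n"
print(ensure_key(before, "QtQuickRendererSettings",
                 "GraphicsResetNotifications", "true"))
```

Running it a second time on its own output changes nothing, so it is safe to apply to both plasmashellrc and kwinrc content more than once.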

Hope it helps,
Regards.

Note that the GraphicsResetNotifications workaround is only available since Qt 5.12, preferably even later. Reference KDE bug: https://bugs.kde.org/show_bug.cgi?id=364766

Thank you very much, this solved my problem too.

I have to revive this topic because now people (including myself) are seeing black textures in Firefox, which are possibly caused by EGL_EXT_buffer_age (an EGL equivalent of GLX_EXT_buffer_age). There is this Firefox bug:

The bug contains a video with the problem demonstration, and a workaround, which is to essentially disable buffer age tracking (and, presumably, the use of EGL_EXT_buffer_age).

Although the bug above focuses on compositing disabled, the problem happens with compositing enabled as well (1735784 - Firefox window contents occasionally black or inconsistent).

Nvidia, please fix EXT_buffer_age or at least don’t expose it from the driver.