Some window titles and terminal output flash with the picom glx backend

When I open some windows (frequently xfce4-terminal), the application's title/menu bar sometimes flashes (see the attached media).

I also see this bug with evince-3.34.2 while scrolling a PDF with embedded images (see the attached media).

Gentoo x86_64, GeForce GTX 1650 Super 4 GB (previously a GeForce GTX 1050 2 GB), nvidia-drivers-440.82 (and earlier versions).

Openbox, latest picom from git. The problem occurs with backend=glx (with or without the experimental backends); with backend=xrender there is no problem.

[Some window titles and terminal outputs flashes · Issue #268 · yshui/picom · GitHub](https://github.com/yshui/picom/issues/268)

mesa-19.3.5 (X classic dri3 egl gbm gles2 libglvnd -d3d9 -debug -gallium -gles1 -llvm -lm-sensors -opencl -osmesa -pax_kernel -selinux -test -unwind -vaapi -valgrind -vdpau -vulkan -vulkan-overlay -wayland -xa -xvmc ABI_X86="64 -32 -x32" VIDEO_CARDS="-freedreno -i915 -i965 -intel -iris -lima -nouveau -panfrost -r100 -r200 -r300 -r600 -radeon -radeonsi -vc4 -virgl -vivante -vmware")

xorg-server-1.20.7 (elogind libglvnd suid udev xorg xvfb -debug -dmx -doc -ipv6 -kdrive -libressl -minimal -selinux -static-libs -systemd -unwind -wayland -xcsecurity -xephyr -xnest)

Kernel 5.4.28.

```
$ LIBGL_DEBUG=verbose glxinfo -B
name of display: :0
display: :0 screen: 0
direct rendering: Yes
Memory info (GL_NVX_gpu_memory_info):
    Dedicated video memory: 4096 MB
    Total available memory: 4096 MB
    Currently available dedicated video memory: 3592 MB
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GeForce GTX 1650 SUPER/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 440.82
OpenGL core profile shading language version string: 4.60 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.6.0 NVIDIA 440.82
OpenGL shading language version string: 4.60 NVIDIA
OpenGL context flags: (none)
OpenGL profile mask: (none)

OpenGL ES profile version string: OpenGL ES 3.2 NVIDIA 440.82
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
```

nvidia-bug-report.log.gz.log (205.1 KB)

This sounds like a classic race condition for OpenGL-based compositors: the X server sends Damage extension events as soon as it has queued rendering that will result in a change to pixels on the screen. However, it does not wait for that rendering to actually finish before sending the event.

If a compositor receives a Damage event and responds by sending rendering requests (e.g. using the XRender extension) back to the X server, then ordering on the GPU is guaranteed: the rendering that triggered the Damage event will happen before the rendering requested by the compositor. However, if the compositor is using direct rendering (e.g. GLX, EGL, or Vulkan) then there is no ordering guarantee: the rendering requested by the compositor can happen immediately, including reading from the window that the X server is about to render into.
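
To make the race concrete, here is a rough sketch of the damage-driven path a GL compositor typically follows (a simplified illustration, assuming the XDamage extension; pixmap setup and the actual GL drawing are omitted):

```c
#include <X11/Xlib.h>
#include <X11/extensions/Xdamage.h>

static void watch_window(Display *dpy, Window win)
{
    int damage_event, damage_error;
    XDamageQueryExtension(dpy, &damage_event, &damage_error);

    /* Deliver a DamageNotify event whenever the window's damage region
     * becomes non-empty. */
    XDamageCreate(dpy, win, XDamageReportNonEmpty);

    for (;;) {
        XEvent ev;
        XNextEvent(dpy, &ev);
        if (ev.type == damage_event + XDamageNotify) {
            XDamageNotifyEvent *dev = (XDamageNotifyEvent *)&ev;

            /* Clear the damage region so further updates are reported. */
            XDamageSubtract(dpy, dev->damage, None, None);

            /* A GLX/EGL compositor would now sample the window's pixmap
             * (e.g. via GLX_EXT_texture_from_pixmap) and redraw. Without a
             * fence, this read can race with the X server's own rendering,
             * because DamageNotify only means the rendering was queued. */
        }
    }
}
```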

The solution to this is to enforce the correct ordering by creating XSyncFence objects, importing them into OpenGL using the glImportSyncEXT function, and then telling OpenGL to wait for the X server’s rendering work to finish before reading window contents. I took a quick look at the picom source and it doesn’t seem to use these synchronization primitives, although I didn’t dig into the source in depth.

Using fences correctly is a little tricky. Please refer to src/compositor/meta-sync-ring.c in Mutter for an example of how to use and reset fences to fix this problem.
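
For illustration, a minimal sketch of that approach (assuming GL_EXT_x11_sync_object is supported and glImportSyncEXT has been resolved, e.g. via glXGetProcAddress; extension setup and error handling are omitted):

```c
#include <X11/Xlib.h>
#include <X11/extensions/sync.h>
#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>

/* Create an untriggered X fence and import it as a GL sync object. */
static GLsync import_x_fence(Display *dpy, XSyncFence *fence_out)
{
    *fence_out = XSyncCreateFence(dpy, DefaultRootWindow(dpy), False);
    return glImportSyncEXT(GL_SYNC_X11_FENCE_EXT, (GLintptr)*fence_out, 0);
}

/* Before reading window contents for a frame: ask the X server to trigger
 * the fence once its queued rendering has finished, and make the GL command
 * stream wait for that. glWaitSync() waits on the GPU side, so the
 * compositor itself is not stalled. */
static void order_reads_after_x_rendering(Display *dpy, XSyncFence fence, GLsync sync)
{
    XSyncTriggerFence(dpy, fence);
    XFlush(dpy);
    glWaitSync(sync, 0, GL_TIMEOUT_IGNORED);
}
```

A fence can only be triggered once before it has to be reset with XSyncResetFence, which is why meta-sync-ring.c keeps a ring of fences and recycles the oldest one each frame.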

That was my initial theory too. picom already has a fix for this problem, which waits for an XSyncFence before it renders each frame (https://github.com/yshui/picom/blob/0b377537ec9c3f6faaa13878701d8d0b2ee62d0c/src/x.c#L519). This is not as efficient as using a fence ring with EXT_x11_sync_object, but it is correct nonetheless.
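
For comparison, a simplified sketch of that blocking approach (not picom's actual code; assumes libxcb with the Sync extension, error handling omitted):

```c
#include <stdlib.h>
#include <xcb/xcb.h>
#include <xcb/sync.h>

static void wait_for_server_rendering(xcb_connection_t *c, xcb_window_t root)
{
    xcb_sync_fence_t fence = xcb_generate_id(c);

    /* Create the fence in the untriggered state. */
    xcb_sync_create_fence(c, root, fence, 0);

    /* Ask the server to trigger it once all rendering queued so far has
     * completed, and to hold back this client's later requests until then. */
    xcb_sync_trigger_fence(c, fence);

    /* xcb_request_check() forces a round trip; because the server will not
     * process this client's subsequent requests until the fence triggers,
     * the call blocks until the server's rendering is done. */
    free(xcb_request_check(c, xcb_sync_await_fence_checked(c, 1, &fence)));

    xcb_sync_destroy_fence(c, fence);

    /* Now it is safe to sample window pixmaps with GL for this frame. */
}
```

Because this blocks the client on every frame, the GPU-side wait that EXT_x11_sync_object provides avoids a per-frame round trip.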

The reporter has already tried enabling that option, but the problem persists.

Hmm, that’s a good point. It looks like the code forces that option on whenever it detects NVIDIA, so it’s not surprising that trying to change it didn’t show any effect for the user.

I can’t reproduce the problem here, so I won’t be able to investigate it myself.

Did you try to match the configuration of the reporter? Do you need more information about their setup?

We would really appreciate it if you could investigate further, because there doesn't seem to be much I can do without insight into the driver itself.

The reporter mentioned this problem doesn’t show up if picom is run under apitrace.