ForceCompositionPipeline causes hard lockups

I have two displays attached to a GTX 980. When I enable the ForceCompositionPipeline option in xorg.conf for both of them and reboot, I get full system lockups at random times. The UI freezes, the mouse doesn’t move, keypresses have no effect, and even SysRq+REISUB doesn’t always work. The CPU fan spins up, so it looks like a busy wait of some kind. The system is inaccessible from the network as well (it may respond to pings for a few seconds after the hang, but then not even that). Basically, all I can do at this point is hard-reset the system.

The lockups can happen at any time after boot. I’ve seen the system hang within a few minutes after boot, during normal activity like browsing or watching a video, and after a few hours while the system was idle and the display was turned off by power saving (the system itself does not go into suspend mode).

Without the ForceCompositionPipeline option, with the rest of the config unchanged, the lockups don’t happen. I have tested Linux kernels 4.10.17, 4.13 and 4.15-rc9, and the hangs happen the same way on all of them. The Nvidia driver is 390.12.

Instead of /etc/X11/xorg.conf, I only have one 20-nvidia.conf file in /etc/X11/xorg.conf.d with this content:

Section "Device"
    Identifier "Default nvidia Device"
    Driver "nvidia"
    Option "NoLogo" "True"
    Option "CoolBits" "12"
    Option "TripleBuffer" "True"
    Option "MetaModes" "DP-4: nvidia-auto-select {ForceCompositionPipeline=On}, HDMI-0: nvidia-auto-select {ForceCompositionPipeline=On}"
    Option "MetaModeOrientation" "RightOf"
#    Option "UseEdidDpi" "False"
#    Option "DPI" "162 x 161"
EndSection

When I comment out MetaModes and MetaModeOrientation, the lockups no longer happen.
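For anyone who wants to repeat this comparison, here is a minimal Python sketch that builds the MetaModes value with and without the flag for a list of outputs. The helper names (build_metamodes, metamodes_option_line) are made up for illustration; the output names and nvidia-auto-select come from the config above. Keeping MetaModes and dropping only the flag is a narrower test than commenting both options out, and nobody in this thread has tried it, so treat it as an assumption.

```python
def build_metamodes(outputs, force_composition=True, mode="nvidia-auto-select"):
    """Return the value string for Option "MetaModes".

    outputs: output names as the driver reports them, e.g. ["DP-4", "HDMI-0"].
    force_composition: append {ForceCompositionPipeline=On} to every output.
    """
    suffix = " {ForceCompositionPipeline=On}" if force_composition else ""
    return ", ".join(f"{name}: {mode}{suffix}" for name in outputs)


def metamodes_option_line(outputs, force_composition=True):
    """Return a full xorg.conf Option line, indented as in the config above."""
    value = build_metamodes(outputs, force_composition)
    return f'    Option "MetaModes" "{value}"'


if __name__ == "__main__":
    displays = ["DP-4", "HDMI-0"]  # output names taken from the original post
    print(metamodes_option_line(displays, force_composition=True))
    print(metamodes_option_line(displays, force_composition=False))
```

The first printed line reproduces the MetaModes option from the config above; the second keeps the same layout with the composition pipeline left at its default.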

While testing kernel 4.15-rc9, I noticed quite a few soft lockup errors in the kernel log, some of them mentioning the nvidia driver in the backtraces. I’m attaching the kernel log as well.
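A rough way to pull these reports out of a large kernel log is sketched below in Python. It assumes the usual watchdog message format ("BUG: soft lockup - CPU#N stuck for Ns! [task:pid]") and treats a report as driver-related if any of the following lines carries a module tag starting with "[nvidia". The 40-line window and the function name find_soft_lockups are arbitrary choices for illustration, and the sample lines in the demo are made up, not copied from the attached log.

```python
import re

# Kernel watchdog report, e.g.
# "watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [Xorg:1201]"
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+?):(\d+)\]"
)


def find_soft_lockups(lines, context=40, marker="[nvidia"):
    """Return one dict per soft lockup report found in `lines`.

    The `context` lines after each report (cut short at the next report)
    are searched for `marker`, which matches backtrace frames tagged
    [nvidia], [nvidia_modeset] or [nvidia_drm].
    """
    lines = list(lines)
    reports = []
    for i, line in enumerate(lines):
        match = SOFT_LOCKUP.search(line)
        if match is None:
            continue
        trace = []
        for following in lines[i + 1 : i + 1 + context]:
            if SOFT_LOCKUP.search(following):
                break  # the next report starts here
            trace.append(following)
        reports.append({
            "cpu": int(match.group(1)),
            "seconds": int(match.group(2)),
            "task": match.group(3),
            "pid": int(match.group(4)),
            "in_driver": any(marker in frame for frame in trace),
        })
    return reports


if __name__ == "__main__":
    # Illustrative lines only; a real run would iterate over the log file.
    sample = [
        "[  100.0] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [Xorg:1201]",
        "[  100.1]  example_symbol+0x12/0x30 [nvidia]",
    ]
    for report in find_soft_lockups(sample):
        print(report)
```

The "Modules linked in:" line lists nvidia without brackets, so it does not count as a hit; only tagged backtrace frames do.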

nvidia-bug-report.log.gz (123 KB)
kern_4.15.0rc9.log.gz (1.16 MB)

I have downgraded the nvidia driver to 387.34, and the system has been running for several hours without any lockups with kernel 4.13 and ForceCompositionPipeline=On. It looks like the problem is a regression in 390.12.

Hi Lastique,

Thanks for reporting this. We’re tracking it as bug 2043915 and it should be fixed in the next release.

I’m going to assume for now that my problem is the same. I’m seeing the same symptoms, and I’m running with ForceCompositionPipeline + ForceFullCompositionPipeline as well. My thread is this:

https://devtalk.nvidia.com/default/topic/1029097/linux/-390-12-system-hangs-with-100-cpu-unkillable-process-gtx970-/

I’ve been testing for a couple of hours now on 390.12 without the composition pipeline. So far, no lockup. But I can’t stand the tearing anymore. I’ll retest when the next release comes out.

Can you please try 390.25, which was released this morning?

I will try it as soon as it is packaged for Kubuntu.

After a few hours of testing, it looks like 390.25 fixed the problem. Thanks!
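To check which driver a machine is actually running against the versions mentioned in this thread, something like the following Python sketch could be used. It assumes /proc/driver/nvidia/version contains a line of the form "NVRM version: NVIDIA UNIX x86_64 Kernel Module  <version>  <build date>", which is how the proprietary module of this era reports itself; other builds may word the line differently. The two version sets only record what posters here reported, not an official list of affected releases.

```python
import re
from pathlib import Path

VERSION_FILE = Path("/proc/driver/nvidia/version")

# Reports from this thread only, not an official compatibility list.
REPORTED_AFFECTED = {"390.12"}
REPORTED_WORKING = {"387.34", "390.25"}


def parse_driver_version(text):
    """Extract the version after "Kernel Module", or None if not found."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def classify(version):
    """Map a version string to what this thread says about it."""
    if version in REPORTED_AFFECTED:
        return "reported affected in this thread"
    if version in REPORTED_WORKING:
        return "reported working in this thread"
    return "not covered by this thread"


if __name__ == "__main__":
    if VERSION_FILE.exists():
        version = parse_driver_version(VERSION_FILE.read_text())
        print(version, "->", classify(version))
    else:
        print("nvidia kernel module does not appear to be loaded")
```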

I have been using it all day without a lockup as well. Seems fixed. Thanks.

Great, thanks for confirming!

I’m experiencing the same symptoms. Almost once a day now, Xorg hangs with 100% CPU usage and is unkillable. I use ForceCompositionPipeline=On because without it I get really bad tearing everywhere.

The last time it happened, I was just typing in Visual Studio Code. I SSHed into my PC and tried to kill Xorg, but it wouldn’t let me. I tried to reboot: the SSH connection was cut, but the PC didn’t reboot, so I had to hard reset.
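While SSH still works during such a hang, it may be worth recording whether the stuck process is burning CPU in user or kernel mode. The Python sketch below reads the state, utime and stime fields (fields 3, 14 and 15) from /proc/<pid>/stat twice and reports the difference in seconds. The function names are invented for illustration, and a high kernel-time share is only a hint that the process is spinning inside a driver or syscall, not proof.

```python
import os
import time


def parse_stat(text):
    """Parse state, utime and stime out of a /proc/<pid>/stat line.

    The command name sits in parentheses and may contain spaces, so the
    remaining fields are split off after the last ')'.
    """
    rest = text[text.rindex(")") + 2 :].split()
    # rest[0] is field 3 (state); fields 14 and 15 are rest[11] and rest[12].
    return {"state": rest[0], "utime": int(rest[11]), "stime": int(rest[12])}


def read_stat(pid):
    with open(f"/proc/{pid}/stat") as handle:
        return parse_stat(handle.read())


def sample_cpu(pid, interval=1.0):
    """Return CPU seconds spent in user and kernel mode over `interval`."""
    ticks = os.sysconf("SC_CLK_TCK")  # utime/stime are in clock ticks
    before = read_stat(pid)
    time.sleep(interval)
    after = read_stat(pid)
    return {
        "state": after["state"],
        "user_s": (after["utime"] - before["utime"]) / ticks,
        "kernel_s": (after["stime"] - before["stime"]) / ticks,
    }


if __name__ == "__main__":
    pid = os.getpid()  # demo on this process; use the Xorg PID in practice
    if os.path.exists(f"/proc/{pid}/stat"):
        print(sample_cpu(pid, interval=0.2))
```

A state of R with kernel_s close to the sampling interval would fit the busy-wait guess from the original report.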

Driver Version: 418.74
Graphics Processor: GeForce GTX 760
Operating System: Fedora 28 (will upgrade soon when I have time)