Random freezing when connected to three monitors (GTX 750Ti, driver 390.77/396.51, Linux Mint 18.3/19)

I am using a GTX 750Ti with driver version 390.77 straight from the graphics-drivers PPA on Linux Mint 18.3, and all three outputs (HDMI, DVI, and DP) are connected to one monitor each. The system boots fine: I can log in, the full desktop is displayed, and everything appears to work. Eventually, though, the entire screen freezes permanently, with no mouse movement or keyboard input, and I have to do a hard restart. The “randomness” is in the timing: sometimes it happens ten seconds after I get to the desktop, sometimes after three hours of uptime, and sometimes not at all. I cannot find any pattern whatsoever. As soon as I disconnect one of the monitors, it stops happening; it only ever happens with all three monitors connected.

Please let me know if I need to provide any further information.
Thank you for any help.
nvidia-bug-report.log.gz (129 KB)

The log doesn’t show any anomaly. See if you can SSH into the system while it is frozen and create an nvidia-bug-report.log that contains the actual error.
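
For anyone following along, capturing the report from a second machine might look roughly like the sketch below. The host `user@frozen-box` is a placeholder, and this assumes sshd was already running on the affected system before the freeze. `nvidia-bug-report.sh` ships with the driver and writes `nvidia-bug-report.log.gz` into the current directory.

```shell
#!/usr/bin/env bash
# Sketch: collect nvidia-bug-report.log.gz from a machine whose display is frozen.
# Assumption: the host argument (e.g. user@frozen-box) is a placeholder and
# sshd was already running on that machine before the freeze.
collect_frozen_report() {
    local host="$1"
    # -t allocates a terminal so sudo can prompt for a password.
    ssh -t "$host" 'cd /tmp && sudo nvidia-bug-report.sh' || return 1
    # Copy the generated archive back to the current directory.
    scp "$host:/tmp/nvidia-bug-report.log.gz" .
}

# Usage (run from the second machine while the first is frozen):
#   collect_frozen_report user@frozen-box
```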

Apologies, it took a while to reproduce the freeze and find a second machine to connect with, but attached is a log file generated while the system was frozen.
nvidia-bug-report.log.gz (99 KB)

Unfortunately, no GPU- or driver-related errors are logged; the GPU is alive and kicking. So it looks more like an Xorg/DE-related issue. Is there anything in the journal while it is frozen?
Does the issue vanish regardless of which monitor you remove?
Since you’re still on Xorg 1.18, maybe try upgrading to 1.19:
https://wiki.ubuntu.com/Kernel/LTSEnablementStack

I will check the journal when it happens again.

For what it’s worth: As far as I can tell, reverting the driver to 384.130 also resolves the issue. At least it hasn’t frozen once since I downgraded.

Was able to reproduce with driver version 396.51; Log attached. After half a month of use, I can say with reasonable certainty that 384.130 does not have the issue.

When comparing the journal to that of a boot with only two monitors (without the freeze), I noticed that the following appeared right before the freeze, but it looks like it was just flatpak fetching the runtime for the new driver I tried. Nothing else stood out that wasn’t also in the normal log.

Aug 17 12:31:10 Fusewood flatpak[2424]: libostree pull from 'flathub' for runtime/org.freedesktop.Platform.GL.nvidia-396-51/x86_64/1.4 complete
                                        security: GPG: summary+commit http: TLS
                                        non-delta: meta: 2 content: 0
                                        transfer: secs: 4 size: 1,9 kB
Aug 17 12:31:11 Fusewood flatpak[2424]: libostree pull from 'flathub' for runtime/org.freedesktop.Platform.GL.nvidia-396-51/x86_64/1.4 complete
                                        security: GPG: summary+commit http: TLS
                                        delta: parts: 1 loose: 1
                                        transfer: secs: 1 size: 355,3 kB

nvidia-bug-report.log.gz (97.6 KB)
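
A comparison like the one described above could be done along these lines. This is only a sketch: the helper name `journal_only_in_frozen` and the file names are made up for illustration, and exporting earlier boots with `journalctl -b -1` assumes persistent journal storage is enabled.

```shell
#!/usr/bin/env bash
# Sketch: print journal messages present in the "frozen" boot but absent from a good boot.
# Inputs are two text files, e.g. produced with (boot offsets are assumptions):
#   journalctl -b -1 -o cat > frozen.txt   # boot that froze
#   journalctl -b -2 -o cat > good.txt     # boot with two monitors
# -o cat prints only the message text, so timestamps do not defeat the comparison.
journal_only_in_frozen() {
    local good="$1" frozen="$2" a b
    a="$(mktemp)"; b="$(mktemp)"
    sort -u "$good" > "$a"
    sort -u "$frozen" > "$b"
    comm -13 "$a" "$b"      # lines that occur only in the second (frozen) file
    rm -f "$a" "$b"
}

# Usage:
#   journal_only_in_frozen good.txt frozen.txt
```

Messages that embed PIDs or counters will still show up as differences, so the output needs a manual pass.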

Can also confirm it still occurs after having upgraded to Linux Mint 19, including Xorg 1.19. Interestingly, the journal produced something that might be related this time, see attached snippet. And, as always, disabling one of the monitors fixes it completely.
journal-snippet.txt (2.88 KB)
nvidia-bug-report.log.gz (100 KB)

I noticed that Xorg seems to be using 100% of a CPU thread after the freeze occurs (spotted in htop through SSH), with no way to stop it. It seems to be completely blocked, which would explain the behaviour.

Looks like the nvidia modesetting module runs into a deadlock, so an unkillable Xorg at 100% CPU is the consequence.
Did you already try the kernel parameter
nvidia-drm.modeset=1
to see if it works around the issue?

I have not tried it yet. Do I need to do anything other than add that option to the boot options in grub to enable it? The various online sources somewhat contradict one another.

If adding the option to grub is all I would need to do, it didn’t help; the freezes are still happening, see attached files.
journal-snippet.txt (4.3 KB)
nvidia-bug-report.log.gz (110 KB)

Adding it to grub was sufficient. It didn’t help, but it was worth a try.
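
For reference, adding the parameter on a Mint/Ubuntu-style system could be sketched as below. The helper `add_kernel_param` is hypothetical and assumes the usual double-quoted `GRUB_CMDLINE_LINUX_DEFAULT="..."` line in `/etc/default/grub`; editing the file by hand works just as well.

```shell
#!/usr/bin/env bash
# Sketch: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT if it is missing.
# Assumption: the file uses the usual double-quoted form found in /etc/default/grub.
add_kernel_param() {
    local file="$1" param="$2"
    # Do nothing if the parameter is already present on that line.
    grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file" && return 0
    # Insert the parameter just before the closing quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file"
}

# Usage on the real system (as root), followed by regenerating the grub config:
#   add_kernel_param /etc/default/grub nvidia-drm.modeset=1
#   update-grub
# After a reboot, this should print Y if the parameter took effect:
#   cat /sys/module/nvidia_drm/parameters/modeset
```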

It seems that turning on ForceCompositionPipeline and ForceFullCompositionPipeline bypassed this issue for me - at least there hasn’t been a freeze in over 30 hours. Which of the two causes the freeze to not happen anymore remains to be seen.
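
For readers who want to try the same thing without editing xorg.conf first, the option can also be applied at runtime through a MetaMode string. The sketch below only builds that string; the helper name `build_metamode`, the output names, and the offsets are placeholders, not the values from this setup.

```shell
#!/usr/bin/env bash
# Sketch: build a MetaMode string that enables ForceFullCompositionPipeline on every output.
# Arguments are "OUTPUT:+X+Y" pairs; the output names and offsets in the usage
# line are placeholders, real names can be listed with `xrandr --query`.
build_metamode() {
    local out="" pair name pos
    for pair in "$@"; do
        name="${pair%%:*}"   # text before the first colon, e.g. HDMI-0
        pos="${pair#*:}"     # text after the first colon, e.g. +0+0
        out="${out:+$out, }$name: nvidia-auto-select $pos {ForceFullCompositionPipeline=On}"
    done
    printf '%s\n' "$out"
}

# Usage (lasts until the X server restarts):
#   nvidia-settings --assign CurrentMetaMode="$(build_metamode HDMI-0:+0+0 DVI-I-1:+1920+0 DP-1:+3840+0)"
```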

That sheds some light on it. The composition pipeline has been reworked; to return to the old one on failures, there’s the compatibility option
Option "UseNvKmsCompositionPipeline" "false"

I removed ForceCompositionPipeline and ForceFullCompositionPipeline from my xorg.conf and added the UseNvKmsCompositionPipeline option set to false, and it does indeed seem to also bypass the issue; so far, there has not been a single freeze, although it has only been a day.

You misunderstood: UseNvKmsCompositionPipeline is an additional option to ForceFullCompositionPipeline that switches between implementations.
So removing ForceFullCompositionPipeline fixes the issue; now see if having ForceFullCompositionPipeline + UseNvKmsCompositionPipeline fixes it, too.

With “UseNvKmsCompositionPipeline” set to “false” and ForceFullCompositionPipeline turned on for all monitors, there has also not been a freeze in 30 hours, so this does seem to work as well. Attached my xorg.conf for reference.
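
As an illustration of that combination (not a copy of the attached file), the sketch below writes a Screen section to a scratch file for review. The helper name, the Identifier, the output names, and the offsets are all placeholders, and placing UseNvKmsCompositionPipeline in the Screen section is an assumption; the options would need to be merged into the matching section of the real xorg.conf.

```shell
#!/usr/bin/env bash
# Sketch: write an X config snippet combining ForceFullCompositionPipeline
# with UseNvKmsCompositionPipeline=false. Output names, offsets and the
# Identifier are placeholders; this is not the xorg.conf attached to the post.
write_nvidia_screen_conf() {
    cat > "$1" <<'EOF'
Section "Screen"
    Identifier "Screen0"
    Option "UseNvKmsCompositionPipeline" "false"
    Option "metamodes" "HDMI-0: nvidia-auto-select +0+0 {ForceFullCompositionPipeline=On}, DVI-I-1: nvidia-auto-select +1920+0 {ForceFullCompositionPipeline=On}, DP-1: nvidia-auto-select +3840+0 {ForceFullCompositionPipeline=On}"
EndSection
EOF
}

# Usage: write to a scratch file, then merge the two Option lines by hand
# into the Screen section of /etc/X11/xorg.conf and restart X.
#   write_nvidia_screen_conf /tmp/nvidia-screen.conf
```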
xorg.conf.gz (821 Bytes)