Dual G-Sync displays and 2080 Ti rather buggy and frustrating experience with latest drivers

OS: Ubuntu 18.04
Driver version: 430.14
GPU: MSI 2080 Ti Seahawk EK X
Monitors: Dual 32" 4k Acer Predators (XB321HK)

I try to work and play on the same Linux box. For work under Linux with Nvidia, I favor dual displays with the composition pipeline enabled to combat screen tearing, as v-sync does not seem to do anything in regular desktop mode. Gaming with Nvidia’s drivers favors one G-sync display with the composition pipeline turned off, as G-sync with dual monitors is not supported at all. Plus, do you really want the extra latency of another frame buffer when you could be using G-sync? The composition pipeline tends to hang when one display disappears from the configuration. Using a KVM to switch one display, keyboard, and mouse to another machine over DisplayPort causes one display to disappear. Switching things around, say with the KVM and also with different users on the same system (this is a home computer and multiple people live here), apparently causes G-sync to stop working (or else it just randomly stops working), and then nothing makes it work again until I log out and back in. When doing work with lots of things open, constantly closing everything down is not an option. When G-sync is not working, v-sync does not prevent persistent screen tearing. It is more like slip-sync, as it will slip out of sync and the tear line usually stays around the same spot. The composition pipeline with gaming and v-sync not syncing leads to frame judder (as in repeat a frame, then skip a frame, and keep going back and forth between skipping and repeating in rapid succession). Steam Proton gaming without G-sync causes a frame skip once every second.

All this constant flipping around, changing modes, things breaking, etc. gets rather maddening. While the basics do ‘work’ here, these driver issues are making the whole deal a death by a thousand pin pricks. Linux is becoming a lot more popular for gaming and has a strong work audience and a strong developer presence, but we just don’t have access to fix the proprietary drivers ourselves (I am a software engineer and I have fixed up and recompiled a number of applications under Linux). It should be: bring up the system, configure the drivers once, and then everything just works. No turning the composition pipeline on and off to deal with screen tearing, no screen freezing up because a monitor was either turned off or switched to another computer from a KVM, no having to disable a display because I want to play a video game (especially with G-sync capable displays and an Nvidia card), no screen tearing while playing a video game, and no G-sync refusing to work when everything is set up so as to make it happy, requiring a logout / reboot.

nvidia-bug-report.log.gz (913 KB)

I am up to the latest 430.26 drivers and I lost G-sync just a few days after updating and rebooting. I get a message with NVIDIA X Server Settings saying the configuration in the control panel is out of sync with the actual settings and it asks me to reload. After this no more G-sync. G-sync is really key to just get things to look right at all under Linux due to various problems with these drivers, so especially on a system that is always up and lots of windows open all the time, this is rather debilitating.

I can only get G-Sync to work if I disable the second screen from the system panel. It is not enough to just turn off the screen; you really need to be running X with only a single screen.

BTW: there is also a major issue with power saving when you are using two 4K monitors: your GPU will always be in the max power state, drawing a lot of power and generating excess heat. The only fix used to be to disable the second screen, but I found a workaround by changing the resolution of the primary screen to 1024x768 and back.
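The resolution workaround can be scripted rather than clicked through each time. The following is only a sketch under assumptions: the output name `DP-0` and the native mode `3840x2160` in the usage note are placeholders (check `xrandr --query` for the real names), the helper names are made up for illustration, and the pause length is a guess.

```python
import subprocess
import time


def resolution_toggle_commands(output, native_mode, low_mode="1024x768"):
    """Return the two xrandr calls: drop to low_mode, then restore native_mode."""
    return [
        ["xrandr", "--output", output, "--mode", low_mode],
        ["xrandr", "--output", output, "--mode", native_mode],
    ]


def toggle_resolution(output, native_mode, low_mode="1024x768", pause=5.0):
    """Switch the output to a low mode, wait, then switch it back."""
    low, native = resolution_toggle_commands(output, native_mode, low_mode)
    subprocess.run(low, check=True)
    time.sleep(pause)  # give the driver time to settle; the length is a guess
    subprocess.run(native, check=True)
```

Calling `toggle_resolution("DP-0", "3840x2160")` once after a reboot would mirror the manual steps. Whether the card then drops to a lower power state still has to be checked with nvidia-smi.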

I have tried getting G-sync back multiple ways including what you mentioned. No dice. The only way that has worked before is logging out and back in (which includes rebooting). I wouldn’t be surprised if I only had one display and especially if I did not have a KVM, it would work all the time instead of maybe a few days to a couple of weeks when I am in single monitor mode, but when you have these things and make heavy use of them, it is hard to say just rip it out completely because there are problems with Nvidia’s Linux drivers.

I have noticed the power draw listed with the nvidia-smi tool never reports below 64W on my 2080 Ti. I wasn’t sure how much of this is dual monitors use more power (I have had problems with this in the past) vs I have this system doing multiple real time tasks that make use of the video card and so it never goes to idle. I am at least using liquid cooling to a thick 560mm radiator along with a monoblock for the motherboard, so it stays cool. The problem with one of the previous cards in the direct predecessor to this system is the card ran real hot all the time due to dual displays, so I wrote a Python script to increase the fan speed (regulate depending on load, but regulate high) to help compensate, but still at some point the card went out with a bang that my next door neighbor mistook for gunfire when a power cap on the card blew and then the system hard downed, probably because of overcurrent protection. (A definite improvement over video cards of the distant past where I would see the old lead based solder melt and ooze across the video card causing a cascading short circuit and a burnt trace going to the motherboard and splatter marks all over the board from popped caps and little chunks of ceramic material with tiny wires hanging out littering the case, but the system would stay on.)
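A load-based fan regulator of the kind mentioned above might look roughly like this. It is not the original script, only a sketch: the 60% floor, the polling interval, and the function names are assumptions, and manual fan control through nvidia-settings generally needs the Coolbits option enabled in the X configuration.

```python
import subprocess
import time


def fan_percent_for_load(load_percent, floor=60, ceiling=100):
    """Map GPU utilization (0-100) linearly onto a fan speed, biased high."""
    load = max(0, min(100, load_percent))
    return round(floor + (ceiling - floor) * load / 100)


def fan_command(percent, gpu=0, fan=0):
    """Build the nvidia-settings call that takes manual control and sets the fan."""
    return [
        "nvidia-settings",
        "-a", f"[gpu:{gpu}]/GPUFanControlState=1",
        "-a", f"[fan:{fan}]/GPUTargetFanSpeed={percent}",
    ]


def read_gpu_load(gpu=0):
    """Ask nvidia-smi for the current GPU utilization in percent."""
    result = subprocess.run(
        ["nvidia-smi", f"--id={gpu}", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def regulate(interval=5.0):
    """Poll the load and adjust the fan until interrupted."""
    while True:
        subprocess.run(fan_command(fan_percent_for_load(read_gpu_load())), check=True)
        time.sleep(interval)
```

Keeping the mapping in its own function makes it easy to swap the linear ramp for a temperature-based curve.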

I don’t like to drop the resolution down and then bring it back up because I have a lot of windows open on this system all of the time, and they get really screwed up when I do that, and then I have to get everything back in order. Just not worth it to me.

It could be KVM as I’m not running any virtual machines on my computer.

A 2080 Ti should be idling at around 15-20W, not around 64W, which means that your card never goes to the lower power states (or switches out of P0).

Try running:

while sleep 1 ; do ( date ; nvidia-smi | egrep W.+W ; sudo nvidia-smi -q | grep -A4 " Clocks$" ) | tee -a /tmp/out ; done
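If a log that is easier to graph is preferred, the same once-a-second sampling can be sketched in Python on top of nvidia-smi's CSV query mode. Treat this as a sketch only: the choice of fields, the `/tmp/out.csv` path, and the helper names are assumptions, not anything prescribed by the driver.

```python
import subprocess
import time

# Fields requested from `nvidia-smi --query-gpu`; this selection is an assumption.
FIELDS = ["timestamp", "pstate", "power.draw", "clocks.gr", "clocks.mem"]


def build_query_command(fields=FIELDS):
    """Return the nvidia-smi command line that prints one CSV row per GPU."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(fields),
        "--format=csv,noheader,nounits",
    ]


def parse_row(row, fields=FIELDS):
    """Split one CSV row from nvidia-smi into a {field: text} dictionary."""
    values = [value.strip() for value in row.split(",")]
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} values, got {len(values)}")
    return dict(zip(fields, values))


def log_forever(path="/tmp/out.csv", interval=1.0):
    """Append one sample per GPU to `path` every `interval` seconds."""
    command = build_query_command()
    with open(path, "a") as log:
        while True:
            result = subprocess.run(command, capture_output=True, text=True, check=True)
            for row in result.stdout.splitlines():
                sample = parse_row(row)
                log.write(",".join(sample[name] for name in FIELDS) + "\n")
                log.flush()
                print(sample["timestamp"], sample["pstate"], sample["power.draw"], "W")
            time.sleep(interval)
```

Running `log_forever()` and watching the `pstate` column should show whether the card ever leaves its high performance states.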

That is a driver bug for sure. :) I was able to get the power use down by 4x just by changing the screen resolution down and up after a reboot – after that, it will be fine even if you suspend the computer. And yes, I always have 4-6 screens worth of windows on my virtual screens, too, so I had to take a weekend to try to fix my issues.

Unfortunately I cannot help with the G-Sync issue; it was tricky to get working properly even without KVM running.

For a little more clarification, I meant “a KVM” as in a keyboard / video / mouse switch. For virtualization I am using VMware as a holdover from long ago, as VMware was the first virtualization solution in this category to really work for me and I have not gotten around to redoing my VMs for another virtualization platform. Specifically, the KVMs I am talking about are two Startech SV431DPUA2 KVMs, as these can handle 4k displays at 60Hz. I regularly alternate between having both displays on the same computer and having each on a different computer, which is why I have two single-monitor KVMs instead of one dual-monitor one. When I switch one to select another computer, the monitor drops out of the config on the Linux box in question. While this is different behavior than my old DVI setup (where once recognized, it never went away), it does have its uses, such as all of my windows being on one screen when I have one of the KVMs switched to another computer. Actually, because of this behavior I decided to stick to two identical displays as opposed to, say, one 2.5k 144Hz screen and one 4k 60Hz screen, as the initial test of one of these 4k displays with an old 2k monitor was rather messy and annoying, plus G-sync really broke in that configuration, so a second identical display was picked in part to try to minimize pissing off G-sync and such.

Also I have multiple real time tasks processing on the 2080 Ti around the clock as in as soon as the computer gets to that service in the boot sequence, it is running and processing, so this card will never actually get to idle where I am at a point to monitor power usage unless of course I manually disable those processes. Then again those processes are doing useful things around the clock and other people are using them, so I really don’t like to disable them if I don’t have to.

If you are curious, I grabbed information from your suggested command piecemeal and here is what seems to be a standard sample from baseload utilization:

  1. nvidia-smi | egrep W.+W:
    | 0% 28C P2 66W / 330W | 5639MiB / 11016MiB | 13% Default |

  2. nvidia-smi -q | grep -A4 " Clocks$":
    Graphics : 1350 MHz
    SM : 1350 MHz
    Memory : 6800 MHz
    Video : 1245 MHz
    Applications Clocks
    Graphics : N/A
    Memory : N/A
    Default Applications Clocks
    Graphics : N/A
    Memory : N/A
    Max Clocks
    Graphics : 2190 MHz
    SM : 2190 MHz
    Memory : 7000 MHz
    Video : 1950 MHz
    Max Customer Boost Clocks
    Graphics : N/A
    Clock Policy
    Auto Boost : N/A
    Auto Boost Default : N/A

Edit: Actually one of the reasons I am trying to get better Linux Nvidia support is I am trying to collapse my setup down to one main computer, no extra costly boxes and no KVMs. Linux does everything else pretty good; just need Nvidia and a small handful of gaming houses to fix their stuff up a little more. I suppose technically I also have a Mac in this configuration, but it is pretty worthless because Apple has shunned people like me and even if I did want to run more Apple stuff, it seems to work better in a VM in VMware than it does on Apple’s physical hardware and is definitely a lot cheaper to run in a VM.

Oh, I’ve not been using those KVM switches for 10 years; right now I have a setup where I have two monitors in my primary computer and yet another 27" 4K monitor in my second one. And if I really need dual monitors, I just use the extra HDMI port and do the switch from my monitor.

I do not own two XB321HK, but I’ve used:

32" Acer XB321HK (4K, primary) + 32" Samsung U32H850 (4K, freesync off) - now
32" Acer XB321HK (4K, primary) + 27" Asus PG27AQ (4K, secondary) - one year ago
27" Asus PG27AQ (4K, primary) + 27" Eizo FlexScan 2736W (2K, no G-sync) - couple of years ago

I’ve never had any issue with G-Sync with any of the above combinations – other than it never worked on a dual screen setup, though that said, I’ve never owned two of the same monitors. I did have a lot of issues with my first 4K (and less so with my second) monitor, though, mostly because of nasty flickering, which was likely caused by a driver issue as the same issue was present with 3 different graphics cards (GTX 970, 1080 and 1080 Ti). The issue is no longer there, so…

As I said, G-Sync has always been working for me, though unfortunately I’ve always had to disable the secondary screen to make it work, even with two similarly spec’ed G-Sync monitors. It also works only in full screen and not in windowed mode, even if the game uses the whole screen area.

Oh, BTW: if I turn off or change the input source on my secondary monitor, it also gets removed from the X server configuration (and the windows move to the primary screen), but luckily the state will be restored when I turn it back on. I think this is a somewhat new feature in the X server itself, as I cannot remember it happening a couple of years ago. It does happen with all monitors and IMHO with Intel integrated graphics as well.
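To see exactly when an output drops out of the X configuration, the text printed by `xrandr --query` can be parsed and compared over time. This is a rough sketch: the regular expression covers the usual "NAME connected [primary] WxH+X+Y" line layout as an assumption about xrandr's output, and the function names are made up for illustration.

```python
import re
import subprocess

# Matches header lines such as "DP-0 connected primary 3840x2160+0+0 (normal ...".
OUTPUT_LINE = re.compile(
    r"^(?P<name>\S+) (?P<state>connected|disconnected)"
    r"(?P<primary> primary)?"
    r"(?: (?P<geometry>\d+x\d+\+\d+\+\d+))?"
)


def parse_outputs(xrandr_text):
    """Return {output name: info} for every output listed by `xrandr --query`."""
    outputs = {}
    for line in xrandr_text.splitlines():
        match = OUTPUT_LINE.match(line)
        if match:
            outputs[match["name"]] = {
                "connected": match["state"] == "connected",
                "primary": match["primary"] is not None,
                # An output with a geometry is part of the current layout.
                "active": match["geometry"] is not None,
            }
    return outputs


def active_outputs(xrandr_text):
    """Return the sorted names of outputs that are currently in the layout."""
    return sorted(
        name for name, info in parse_outputs(xrandr_text).items() if info["active"]
    )


def current_outputs():
    """Run xrandr and parse its report of the live configuration."""
    result = subprocess.run(
        ["xrandr", "--query"], capture_output=True, text=True, check=True
    )
    return parse_outputs(result.stdout)
```

Logging `current_outputs()` before and after flipping the KVM or the monitor input would show whether the output is reported as disconnected or merely inactive.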

If you don’t mind providing some more contrast to see what sticks out and does not stick out as different between how we are doing things in the hopes of getting a better idea where the problem may lie, I have the following questions:

  1. How long is your system up for continuously with you logged in? For me on the system in question it is up and running around the clock and is rebooted once every few weeks to a couple of months for patches and physical maintenance. It is usually days to weeks into running continuously when the problem occurs for me.

  2. Do you have any services running continuously that utilize the GPU? I am running Xeoma with several security camera feeds with Xeoma using the ‘preview’ streams to look for motion and Plex with multiple OTA TV tuners, which are both accessing the GPU through CUDA for video processing. (The 2080 Ti is an especially good card for this as it has a great deal of hardware encoder and decoder capability accessible in a multitasking way through CUDA and this hardware is separate from the processing pipelines used for gaming unless of course you are streaming that gaming video.)

  3. Do you have other users logged into the console of your system? I have other people using this computer with their own accounts and we just switch between which user is active without logging out.

In general, one of the goals of this system is reductive hardware for multiple tasks. For one I cannot afford a large pile of computers each dedicated to its own task. Second I don’t have the physical space for lots of different things, so I need to group up as much as possible. Third having a bunch of lower powered machines eats up more power than one big system with all of these duplicated core components doing the same tasks. Anyways Linux has always been a good multi-tasking performer, a lot better than what I have been able to do with Windows. So I am really kind of forced onto Linux as my primary platform as Windows just splits apart at the seams whenever I try to load it up (and I have gotten Windows to the BSOD on boot point of overloaded many times) and Apple under Tim Cook shuns people like me as in just won’t allow the Apple platform to do the things I need it to do and shuts down the developers trying to make this happen. I am very close to having one physical box do the full gamut of deskside stuff I need it to do with Linux. These last few problems just need to be fixed.

Here are my answers:

  1. My card is GTX 1080 Ti, though I also had 1080 with the same issues.

  2. The computer is always on and rebooted only when needed (kernel updates, GFX driver updates, hardware maintenance).

  3. No services on GPU and the only GPU heavy task is gaming – every few weeks. And good to know, thanks for the info. :)

  4. Not really except for myself. I can be logged in from multiple sources, sometimes forwarding X application.

Maybe it is the CUDA heavy load that messes with G-Sync, though that said, I’ve never been able to make G-Sync work unless I’m in full-screen mode (not in windowed mode) and only if I disable the second monitor. Because of that, I’m forced to close almost all my running X apps, which is a total pain because there are so many of them in my virtual screens. Which is partially why I don’t ever play games (not that I have too much time for it either).

I always need to drop to one screen to make G-sync work when it does work, plus can only use G-sync in full screen mode. I gathered this from the Phoronix article I read before trying this. I don’t really understand the closing of X apps though. All of my stuff just moves over to one screen and then back when I re-enable it. Could it also be I just have so many X windows open all the time?

Maybe something else to consider is the first of these screens I bought. It looked from reviews like the buggy batches had all been sold off and the newer ones were looking good, however it turns out the particular seller through Amazon had an old one sitting in their warehouse for a really long time (there is no way to know ahead of time how long the seller has had it sitting on the shelf), and so the older of the two displays is a bit on the glitchy side. As in, it sometimes flickers with red lines running vertically along the display and sometimes is only seen as a 30Hz monitor, so I have to switch inputs on the KVM and switch back to get it re-recognized as a 60Hz capable display. Also, at one point the display would not turn on even after unplugging and replugging and stuff like that, so I was hoping this was the chance to RMA it and then have both newer displays. However, after the broken Acer tech support nonsense, followed by them contacting the seller because they made the monitor so long ago and didn’t want to believe I had just bought it, they managed to provide instructions that got it to “hard reset” so that it would turn on again, so I am stuck with it for now. (It seems kind of crazy that you would need to “hard reset” a monitor like how you go about clearing your CMOS or something just to get it to turn on, but that is what they had me do.) So I don’t think I can discount one monitor going screwy and causing the Nvidia drivers to foul up and lose G-sync support. At least the newer of the two displays has been glitch free as far as I can tell, so Acer did eventually fix the problems with this display, or so it seems.

Maybe what I am really asking for is more resilient G-sync support so when something goes weird, G-sync support will reset and re-enable itself. It would also really help if when the force composition pipeline is enabled, it could gracefully handle going from two screens to one as right now the one remaining screen basically freezes. With these two things fixed up some, I could at least have a good ‘work’ mode and then be able to switch to a ‘game’ mode when needed instead of work and play both being screwed up.
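Until the driver handles this on its own, switching between a ‘work’ and a ‘game’ mode can at least be reduced to one command each by assigning a MetaMode through nvidia-settings. This is a minimal sketch under assumptions: the output names `DP-0` and `DP-2`, the side-by-side offsets, and the helper names are placeholders, and it has not been checked whether this avoids the freeze described above.

```python
import subprocess


def metamode(outputs, composition_pipeline):
    """Build a MetaMode string for a list of (output name, offset) pairs."""
    flag = "On" if composition_pipeline else "Off"
    return ", ".join(
        f"{name}: nvidia-auto-select {offset} {{ForceCompositionPipeline={flag}}}"
        for name, offset in outputs
    )


def apply_metamode(mode):
    """Hand the MetaMode to the running X server via nvidia-settings."""
    subprocess.run(
        ["nvidia-settings", "--assign", f"CurrentMetaMode={mode}"], check=True
    )


# Hypothetical layouts: two 4K screens side by side for work, one screen for games.
WORK = metamode([("DP-0", "+0+0"), ("DP-2", "+3840+0")], composition_pipeline=True)
GAME = metamode([("DP-0", "+0+0")], composition_pipeline=False)
```

`apply_metamode(GAME)` before launching a game and `apply_metamode(WORK)` afterwards would stand in for the trips to the settings panel.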

Mostly I’m closing other apps because I am only using G-Sync when gaming on something that may run on less than 60 FPS and needs all the power it can get. I remember having some issues if I didn’t do that.

I also had the flicker issue with my Acer screen (which was also there with my Asus screen), either having vertical lines or the whole screen going to a single color for a single frame. I know this because I was able to record the issue with my phone. The issue was present on my older 970 graphics card, which is why I bought the new card. But the issues remained.

Fortunately, I was able to mostly fix the issue by changing the DisplayPort cable and with later drivers and I almost never get the glitches anymore.

I agree that not being able to use G-sync if you have 2 screens is nonsense. It should at least be able to turn off the second screen while gaming so I don’t need to go to the display settings all the time. I know I could also write a script to do that, but… why?
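For anyone who does want the script, a rough sketch could wrap the game launch so the second screen is switched off first and brought back afterwards. The output names `DP-0` and `DP-2` and the function names are hypothetical, and it is an assumption that disabling the output through xrandr is equivalent to disabling it in the display settings as far as G-Sync is concerned.

```python
import contextlib
import subprocess


def disable_command(output):
    """xrandr call that removes an output from the layout."""
    return ["xrandr", "--output", output, "--off"]


def enable_command(output, primary, side="right-of"):
    """xrandr call that re-adds an output at its preferred mode next to primary."""
    return ["xrandr", "--output", output, "--auto", f"--{side}", primary]


@contextlib.contextmanager
def single_screen(secondary, primary, run=subprocess.run):
    """Disable the secondary output for the duration of the with-block."""
    run(disable_command(secondary), check=True)
    try:
        yield
    finally:
        # Restore the second screen even if the game exits with an error.
        run(enable_command(secondary, primary), check=True)


def play(game_command, secondary="DP-2", primary="DP-0"):
    """Run a game (a list of arguments) with only the primary screen active."""
    with single_screen(secondary, primary):
        subprocess.run(game_command)
```

Something like `play(["steam", "-applaunch", "APPID"])`, with a real app ID filled in, could then be bound to a launcher entry.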