System hangs frequently on Linux 5.4, 5.5, 5.6 while playing yuzu in OpenGL

My computer hangs frequently while playing games in the yuzu emulator using the OpenGL backend. Stutters cause the display to periodically freeze and input devices to drop events. When running glxgears in the background, it drops frames during the stutter. yuzu has a Vulkan backend that is not affected by this issue. No other OpenGL application that I have causes this issue, and the issue is not observed in Windows 10.

Tested on multiple kernels in Manjaro: the stutters are reproducible on Linux 5.4, 5.5, 5.6, and 5.6.12-24-tkg-pds, while 4.19 does not stutter. Tested on NVIDIA drivers 390.132, 440.66.11, and 440.82; all of them stutter on 5.6. The desktop environment and whether the compositor is on or off do not seem to influence the stuttering. Stutters are present on my GTX 750 Ti, GTX 970, and GTX 980 Ti, and according to other yuzu users I talked to, the issue is reproducible on an RTX 2080 Ti in Arch Linux.
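
For anyone comparing results on their own machine, here is a minimal sketch for checking whether the running kernel falls in the range reported above. It assumes GNU sort with -V is available; the function name kernel_at_least is made up for illustration, and the 5.4 threshold comes only from the tests described in this post.

```shell
# Hypothetical helper (not part of any tool): succeeds if kernel release $1
# is at least version $2. Suffixes such as "-rc1" or "-24-tkg-pds" are
# stripped first, so 5.4.0-rc1 is treated as 5.4.0.
kernel_at_least() {
    release="${1%%-*}"
    minimum="$2"
    lowest="$(printf '%s\n%s\n' "$minimum" "$release" | sort -V | head -n 1)"
    [ "$lowest" = "$minimum" ]
}

# Usage: classify the running kernel against the versions tested above.
if kernel_at_least "$(uname -r)" 5.4; then
    echo "kernel is 5.4 or newer: stutters were reproducible in this range"
else
    echo "kernel is older than 5.4: 4.19 showed no stutters"
fi
```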

In my efforts to get the bug report log, I found that running startx -- -verbose 6 with no display-related xorg.conf files makes the X session use my second monitor as the primary monitor, and the stutters are not observed even on recent kernel versions. But if I apply any conf generated by nvidia-settings, the stutters come back. In addition, when I disable the first monitor in my desktop environment and restart X, stutters still happen regardless of conf files.

inxi -F -v8 output here: http://ix.io/2mlF
nvidia bug report log gzipped: nvidia-bug-report.log (434.8 KB)
apitrace that reproduces the issue (750 MB compressed, 3.4 GB uncompressed): https://drive.google.com/open?id=1assSjTpRDMNZRS0X6xQKSqYkwY56JebF

Using the apitrace, the most consistent way to trigger stutters is when it is unloading. Otherwise, it doesn’t stutter much during the replay (at least 2 stutters), and the stutters vary from run to run. The trace was captured on Linux 4.19.121, so no stutters were present during the capture.

Is there a better place to ask this? Three days without any feedback whatsoever makes me feel like this isn’t the right place.

It might be the correct place to ask; sometimes there just are no answers.
At least I’m not aware of any performance regressions in kernel versions 5.4-5.6. Since this is only affecting exactly one application, I’d wonder if that’s relying on something special.
Furthermore, your setup is quite complex: one 60Hz monitor and one 144Hz monitor. Which one is the sync device, and which one is the display? Does it depend on nvidia-drm.modesetting=1? There are too many variables to debug anything. The most minimal setup that can reproduce the issue is required.
For a quick test, you could try starting yuzu with the env variable

__GL_MaxFramesAllowed=1

set.
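
One way to set the variable for a single launch only, as a minimal sketch. The wrapper name is made up for illustration, and it assumes the yuzu binary is on PATH when used as in the commented example.

```shell
# Hypothetical wrapper: run any command with __GL_MaxFramesAllowed=1 set
# only for that process, leaving the rest of the session untouched.
run_with_max_frames_1() {
    env __GL_MaxFramesAllowed=1 "$@"
}

# Example (assumes the yuzu binary is on PATH):
#   run_with_max_frames_1 yuzu
```

Going through env keeps the setting out of the shell’s own environment, so other OpenGL programs started later are not affected.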

…I’d wonder if that’s relying on something special.

Likely. yuzu is in very active development right now and at times adopts features as soon as they come out. (A few days ago a Vulkan feature was released in the beta driver, and an hour later yuzu was updated to take advantage of it.)

Sorry that some of this comes as I remember it, but I’ve tried a few things here and there that don’t work. I’ll make a list:

  • Fiddling in nvidia-settings:
    – Force full composition pipeline and/or Force composition pipeline
    – Enabling/disabling vsync, or choosing either monitor as the sync monitor
    – Enabling/disabling G-Sync
    – Enabling/disabling Use Conformant Texture Clamping
    – Setting Image Settings
  • Manually setting the power state
  • Using only the 144Hz monitor, or only the 60Hz monitor, or using the 144Hz monitor alone in 60Hz
  • Forcing PCIe Gen 3.0 in modprobe, and switching x8 or x16
  • Enabling NVreg_UsePageAttributeTable and/or NVreg_EnableMSI in modprobe
  • Installing Manjaro (and other distros) to a different drive, and setting as few settings/installing as few packages as needed to get into yuzu

__GL_MaxFramesAllowed=1 didn’t fix the issue. Next time I post, I’ll have a bug report from another fresh Manjaro installation on the drive just mentioned, as well as results from testing modesetting. So far only Debian Buster works well out of the box. It uses 4.19, which is what put me on the trail of the kernel version being related to the bug.

For the most minimal system:

  • remove xorg.conf and all config files that set special options
  • have only a 60Hz monitor connected
  • set nvidia-drm modeset=0

sudo cat /sys/module/nvidia_drm/parameters/modeset

should return ‘N’ if done right.
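
The check can also be scripted, as a minimal sketch. The function name modeset_state is made up for illustration; it reads the same sysfs file as the command above (run it as root if the file is not readable), and a different path can be passed for testing.

```shell
# Hypothetical helper: report the nvidia-drm modeset state from a parameter
# file. Defaults to the sysfs path used above.
modeset_state() {
    param_file="${1:-/sys/module/nvidia_drm/parameters/modeset}"
    if [ ! -r "$param_file" ]; then
        # Module not loaded, or the file is not readable by this user.
        echo "unknown"
        return 2
    fi
    case "$(cat "$param_file")" in
        N) echo "disabled" ;;
        Y) echo "enabled" ;;
        *) echo "unknown"; return 2 ;;
    esac
}
```

Calling modeset_state with no argument should print "disabled" when the file contains ‘N’.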

I followed your instructions by removing everything but 00-keyboard.conf in /etc/X11/xorg.conf.d/, setting only options nvidia-drm modeset=0 in /etc/modprobe.d/nvidia.conf, creating a new user account to make sure no user-space configs got in the way, and connecting only the 60Hz monitor. The command you suggested returned ‘N’ as it was supposed to.

The issue still persists. nvidia-bug-report.log.gz (468.2 KB) This is NOT a new install of Manjaro; it is the old install with a new user account and the settings removed.

There was a segfault:

yuzu:GPU[5018]: segfault at 55c244a533b0 ip 000055c244a533b0 sp 00007f9eae7fb8d8 error 15

Was that related to the issue or just a coincidence?
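
A note on reading that line: the kernel prints the x86 page-fault error code in hexadecimal, so "error 15" is 0x15. A minimal sketch that decodes the low bits follows. The function name is made up for illustration, and the output only describes the kind of fault; it does not tell whether the crash is related to the stutters.

```shell
# Hypothetical helper: decode the hex "error" field of an x86 segfault line.
decode_segfault_error() {
    code=$((0x$1))
    if [ $((code & 1)) -ne 0 ]; then
        desc="protection-violation"   # bit 0 set: the page was present
    else
        desc="page-not-present"       # bit 0 clear: no page mapped
    fi
    if [ $((code & 2)) -ne 0 ]; then desc="$desc write"; fi               # bit 1
    if [ $((code & 4)) -ne 0 ]; then desc="$desc user-mode"; fi           # bit 2
    if [ $((code & 16)) -ne 0 ]; then desc="$desc instruction-fetch"; fi  # bit 4
    echo "$desc"
}

# The value from the log line above:
decode_segfault_error 15
```

For 0x15 this prints "protection-violation user-mode instruction-fetch". Together with ip being equal to the faulting address, that is consistent with an attempt to execute from a page that is mapped but not executable.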

Did you try to reproduce the issue with the apitrace? I’m not very familiar with apitrace, but if it recorded a segfault at the very end of the trace, that’s a separate Linux-specific issue in yuzu. I ended the trace with Link running out of the pond looking northeast-ish.

Also I forgot to mention that you have to do apitrace replay -b yuzu.trace to replay it. Link’s clothing will be black, but that’s an issue with apitrace I believe.

Yesterday I found a user who tested on a GTX 1060 6GB and an RTX 2080. On their B350/2700X system the issue was solved by using 4.19, and I’m asking them to test on their 4790K. The first user I mentioned is using a 2950X. I’m on an X470/2700X computer, so it’s possibly a Zen+ issue.

One of the other yuzu Linux users found this commit to be the culprit:

I tested this by building Linux 5.4.0-rc1 with and without that commit, and the stutters disappear with the commit removed. The commit is present in all Linux versions from 5.4.0-rc1 onward.

Since the other yuzu user I mentioned is in contact with someone else at Nvidia about this issue, I’m going to drop the topic. Thanks for the help debugging the issue, generix.