Opsu crashes on NVIDIA 535.86.10

Hi everyone,

I used to enjoy playing opsu!, an open-source osu! client (itdelatrisu/opsu on GitHub), but I can no longer play it. I suspect the NVIDIA drivers are at fault, since the driver is the only thing that has changed. The game starts up, looks like it is about to display the game screen, then crashes back to the desktop.

Here are my specs:
OS: Xubuntu 22.04.3 LTS x86_64
Host: X570S AORUS PRO AX -CF
Kernel: 6.2.0-26-generic
Resolution: 1920x1080, 1920x1200
DE: Xfce 4.16
WM: Xfwm4
WM Theme: Default
CPU: AMD Ryzen 9 5900X (24) @ 4.001GHz
GPU: NVIDIA RTX 4070
GPU Driver: 535.86.10
Memory: 31980MiB

Here is a crashlog from the game, which to me indicates that NVIDIA drivers could actually be at fault:
hs_err_pid2262039.log (128.6 KB)

The main error that appears in the command line is basically:
# Java VM: OpenJDK 64-Bit Server VM (11.0.20+8-post-Ubuntu-1ubuntu122.04, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C [libc.so.6+0x97f74] pthread_mutex_lock+0x4

Does anyone have any ideas for any additional logging or things I could check for this issue? I feel pretty confident this is an NVIDIA driver issue.

Confirmed: this is a regression in NVIDIA driver 535. Opsu works on Ubuntu 22.04, with full performance on my NVIDIA RTX 4070, after uninstalling NVIDIA 535 and installing 525.125.06. When I uninstall 525 and install 535.104.05, the game starts crashing again on launch (black screen, then crash to desktop).

I ended up manually installing 530.41.03 drivers again and I had to manually hold them since apt tries to “upgrade” them to 535, which is not an upgrade IMO but a downgrade.

I downloaded the driver .deb files manually from the Canonical Kernel Team PPA: the amd64 build of nvidia-graphics-drivers-530 530.41.03-0ubuntu0.22... and the i386 build of nvidia-graphics-drivers-530 530.41.03-0ubuntu0.22....

Here is the command I used for manually installing the 530.41.03 drivers:
sudo apt install \
  ./libnvidia-compute-530_530.41.03-0ubuntu0.22.04.2_i386.deb \
  ./libnvidia-fbc1-530_530.41.03-0ubuntu0.22.04.2_i386.deb \
  ./libnvidia-extra-530_530.41.03-0ubuntu0.22.04.2_i386.deb \
  ./libnvidia-encode-530_530.41.03-0ubuntu0.22.04.2_i386.deb \
  ./libnvidia-decode-530_530.41.03-0ubuntu0.22.04.2_i386.deb \
  ./libnvidia-gl-530_530.41.03-0ubuntu0.22.04.2_i386.deb \
  ./nvidia-kernel-source-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./nvidia-kernel-common-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./xserver-xorg-video-nvidia-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./nvidia-utils-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./nvidia-driver-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./libnvidia-gl-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./nvidia-dkms-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./nvidia-compute-utils-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./libnvidia-fbc1-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./libnvidia-extra-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./libnvidia-compute-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./libnvidia-encode-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./libnvidia-decode-530_530.41.03-0ubuntu0.22.04.2_amd64.deb \
  ./libnvidia-common-530_530.41.03-0ubuntu0.22.04.2_all.deb \
  ./libnvidia-cfg1-530_530.41.03-0ubuntu0.22.04.2_amd64.deb

After that, I had to mark them as held so apt wouldn’t keep trying to upgrade them to 535:

sudo apt-mark hold libnvidia-common-530 libnvidia-compute-530:i386 libnvidia-decode-530:i386 libnvidia-encode-530:i386 libnvidia-extra-530:i386 libnvidia-fbc1-530:i386 nvidia-dkms-530 nvidia-driver-530

Staying on these older 530 drivers works fine for me, and 530.41.03 gives me better performance than 525.125.06 on my RTX 4070:
525 benchmark: https://i.imgur.com/cMAfyFN.png
530 benchmark: https://i.imgur.com/vGz8mAm.png

This is probably gonna be my last post for a while in here unless others start posting here since I basically have my workaround, which is good enough for me for now.

Unfortunately, I’m back to this issue again because my Ubuntu 22.04 upgraded to kernel 6.5.0-14. It seems the kernel 6.5 series does not work with NVIDIA driver 530.41.03, and the DKMS modules fail to compile:

sudo dkms autoinstall

Kernel preparation unnecessary for this kernel. Skipping...

Building module:
cleaning build area...
unset ARCH; [ ! -h /usr/bin/cc ] && export CC=/usr/bin/gcc; env NV_VERBOSE=1 'make' -j16 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=6.5.0-14-generic IGNORE_XEN_PRESENCE=1 IGNORE_CC_MISMATCH=1 SYSSRC=/lib/modules/6.5.0-14-generic/build LD=/usr/bin/ld.bfd CONFIG_X86_KERNEL_IBT= modules....(bad exit status: 2)
Error! Bad return status for module build on kernel: 6.5.0-14-generic (x86_64)
Consult /var/lib/dkms/nvidia/530.41.03/build/make.log for more information.


Kernel preparation unnecessary for this kernel. Skipping...

Here is some of the output from the end of the make.log file:
make.log (59.6 KB)
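
For anyone hitting the same DKMS failure, the first compiler errors in make.log are usually more telling than the end of the file. Here is a minimal sketch for pulling them out. The helper name first_build_errors is made up for illustration, and the search pattern assumes the usual gcc "error:" and make "***" markers:

```shell
# Hypothetical helper: print the first few build errors from a DKMS make.log.
# $1: path to the log; $2: maximum number of matching lines (default 10)
first_build_errors() {
    log="$1"
    limit="${2:-10}"
    # -n: prefix each match with its line number
    # -m: stop after $limit matching lines
    # Pattern: gcc-style "error:" messages or make's "*** " failure lines
    grep -n -m "$limit" -E 'error:|\*\*\* ' "$log"
}
```

With the log path from the DKMS message above, this could be called as: first_build_errors /var/lib/dkms/nvidia/530.41.03/build/make.log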

I tried both NVIDIA 535 (Production Branch) and NVIDIA 545 (New Feature Branch) using the NVIDIA driver versions available in my apt repositories, 535.129.03-0ubuntu1 and 545.23.08-0ubuntu1, but I still get the same crash as in the original post when trying to run Opsu with either of those two drivers.

I am sticking with the Production Branch NVIDIA 535 drivers for now, even though they don’t work with opsu. Hopefully this issue can be resolved eventually.

I’m still experiencing this opsu crash on NVIDIA 535.161.07. Does anyone have any ideas for how I can get this issue looked into by NVIDIA, any ideas for workarounds, or any extra logging I can provide to help with this?

I guess you should enable complete core dumps so the core is available along with the crash log. Please also provide an nvidia-bug-report.log taken after the crash.

Here is the full error I have received:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000790df9097ef4, pid=569494, tid=569495
#
# JRE version: OpenJDK Runtime Environment (11.0.22+7) (build 11.0.22+7-post-Ubuntu-0ubuntu222.04.1)
# Java VM: OpenJDK 64-Bit Server VM (11.0.22+7-post-Ubuntu-0ubuntu222.04.1, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libc.so.6+0x97ef4]  pthread_mutex_lock+0x4
#
# Core dump will be written. Default location: core.569494
#
# An error report file with more information is saved as:
# hs_err_pid569494.log
#
# If you would like to submit a bug report, please visit:
#   https://bugs.launchpad.net/ubuntu/+source/openjdk-lts
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
[1]    569494 IOT instruction (core dumped)  /usr/lib/jvm/java-1.11.0-openjdk-amd64/bin/java -jar

The core dump is 1 GB, so I’ve compressed it with 7-Zip to about 60 MB. You can rename the attachment to .7z to extract it.

core.569494.zip (69.1 MB)

I ran opsu with core dumping enabled (ulimit -c unlimited) then I ran sudo nvidia-bug-report.sh to generate the nvidia-bug-report log:
nvidia-bug-report.log (7.2 MB)
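
In case it helps anyone reproducing this: ulimit -c unlimited only lifts the size limit, while the kernel setting in /proc/sys/kernel/core_pattern decides where the core ends up (a leading "|" means it is piped to a handler program rather than written to a file). Below is a minimal sketch for interpreting that setting. The helper name describe_core_pattern is made up for illustration:

```shell
# Hypothetical helper: explain where a core dump will go, given the
# contents of /proc/sys/kernel/core_pattern as $1.
describe_core_pattern() {
    case "$1" in
        # A leading "|" means the kernel pipes the core to a program.
        '|'*) echo "piped to handler: ${1#?}" ;;
        # Anything else is a file name template (relative to the crashing
        # process's working directory unless it is an absolute path).
        *)    echo "written to file pattern: $1" ;;
    esac
}
```

A possible call would be: describe_core_pattern "$(cat /proc/sys/kernel/core_pattern)"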

If there’s any other information that might help, or if you have any ideas for workarounds, please let me know. Thank you for taking the time to respond.

Please load the core dump into gdb:
gdb -c core.569494
and then use bt to display the backtrace. Please copy and paste that into your post.

Not sure how useful this is; it doesn’t look too useful to me:

 gdb -c core.569494 
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".

warning: Can't open file /memfd:/.glXXXXXX (deleted) during file-backed mapping note processing

warning: Can't open file /memfd:/.nvidia_drv.XXXXXX (deleted) during file-backed mapping note processing

warning: Can't open file /tmp/hsperfdata_user/569494 (deleted) during file-backed mapping note processing
[New LWP 569495]
[New LWP 569496]
[New LWP 569502]
[New LWP 569503]
[New LWP 569507]
[New LWP 569520]
[New LWP 569522]
[New LWP 569519]
[New LWP 569523]
[New LWP 569525]
[New LWP 569494]
[New LWP 569498]
[New LWP 569501]
[New LWP 569532]
[New LWP 569500]
[New LWP 569497]
[New LWP 569499]
[New LWP 569505]
[New LWP 569504]
[New LWP 569539]
[New LWP 569506]
[New LWP 569510]
[New LWP 569515]
[New LWP 569508]
[New LWP 569536]
[New LWP 569529]
[New LWP 569517]
[New LWP 569528]
[New LWP 569526]
[New LWP 569518]
[New LWP 569511]
[New LWP 569521]
[New LWP 569530]
[New LWP 569527]
[New LWP 569514]
[New LWP 569524]
[New LWP 569509]
[New LWP 569531]
[New LWP 569535]

warning: Section `.reg-xstate/569495' in core file too small.
--Type <RET> for more, q to quit, c to continue without paging--bt
Core was generated by `/usr/lib/jvm/java-1.11.0-openjdk-amd64/bin/java -jar '.
Program terminated with signal SIGABRT, Aborted.

warning: Section `.reg-xstate/569495' in core file too small.
#0  0x0000790df90969fc in ?? ()
[Current thread is 1 (LWP 569495)]
(gdb) bt
#0  0x0000790df90969fc in ?? ()
#1  0x0000790df7bfc0b0 in ?? ()
#2  0x0000003000000018 in ?? ()
#3  0x0000790df7bfc180 in ?? ()
#4  0x0000790df91148d4 in ?? ()
#5  0x000000000003c0a5 in ?? ()
#6  0x00000000000000c2 in ?? ()
#7  0x0000790df8ec8d20 in ?? ()
#8  0x00000000000000c2 in ?? ()
#9  0x0000790df8bba9e0 in ?? ()
#10 0x0000790df88045aa in ?? ()
#11 0x0000790df8ec8d20 in ?? ()
#12 0x00000000000000c1 in ?? ()
#13 0x0000790df7bfc320 in ?? ()
#14 0x0000790df8bba9e0 in ?? ()
#15 0x0000790df7bfc210 in ?? ()
#16 0x0000790df91148d4 in ?? ()
#17 0x000000000003d660 in ?? ()
#18 0x373901a879df4900 in ?? ()
#19 0x0000790df7bff640 in ?? ()
#20 0x0000000000000006 in ?? ()
#21 0x0000790df7bfc2e0 in ?? ()
#22 0x0000790df8ec8cf8 in ?? ()
#23 0x0000790df7bfc320 in ?? ()
#24 0x0000790df9042476 in ?? ()
#25 0x0000790df921be90 in ?? ()
#26 0x0000790df90287f3 in ?? ()
#27 0x0000000000000020 in ?? ()
#28 0x0000790df8805515 in ?? ()
#29 0x0000790df7bfc320 in ?? ()
#30 0x00000000000007d0 in ?? ()
#31 0x0000790df7bfc320 in ?? ()
#32 0x0000003000000010 in ?? ()
#33 0x0000790df7bfc280 in ?? ()
#34 0x0000790df7bfc190 in ?? ()
#35 0x000000000003d660 in ?? ()
#36 0x00000000000007d0 in ?? ()
#37 0x0000790df8b5c256 in ?? ()
#38 0x000000000000000a in ?? ()
#39 0x0000000000000000 in ?? ()

Anything else that might be helpful?

I guess you would need a Java expert to debug this. I have no idea where it’s crashing and what it was trying to do.
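
One possible reason every frame in the backtrace above shows up as ?? is that gdb was started with only the core file, so it had no executable to resolve symbols against. A sketch of an alternative invocation follows, passing the java binary named in the "Core was generated by" line and dumping all threads in batch mode. The helper name backtrace_all_threads is made up for illustration, and frames inside stripped libraries may still show as ??:

```shell
# Hypothetical helper: print a backtrace of every thread in a core file.
# $1: the executable that produced the core; $2: the core file
backtrace_all_threads() {
    exe="$1"
    core="$2"
    # -batch: run the given commands, then exit without prompting
    # -ex:    execute one gdb command ("thread apply all bt" walks each thread)
    gdb -batch -ex 'thread apply all bt' "$exe" "$core"
}
```

For this crash that would be: backtrace_all_threads /usr/lib/jvm/java-1.11.0-openjdk-amd64/bin/java core.569494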

Well the issue is, while the game is an older open-source re-implementation of a closed-source game, it was written with a game library that has since been discontinued, so that doesn’t help.

However, it’s definitely a regression with the NVIDIA drivers specifically, since it was working perfectly for me on 530.41.03 (which now no longer works for me since I’m on Linux kernel 6.5, and even if it did work it’s no longer secure IIRC, so it would need to be built for kernel 6.5 AND have security fixes backported) but it just doesn’t work on 535 and above.

I’m hoping NVIDIA can chime in on this, and look into it, but I understand it’s a niche game so I’m not super hopeful about it. This is my one and only issue with Linux right now, unfortunately.

Please check if it runs with __GL_THREADED_OPTIMIZATIONS=0 env variable set.
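
A minimal sketch of one way to set the variable for a single launch only, without exporting it for the whole session. The wrapper name run_without_threaded_optimizations is made up for illustration:

```shell
# Hypothetical wrapper: run any command with the NVIDIA OpenGL threaded
# optimizations switched off for that command only.
run_without_threaded_optimizations() {
    # env sets the variable in the child's environment and then
    # executes the remaining arguments as the command.
    env __GL_THREADED_OPTIMIZATIONS=0 "$@"
}
```

Usage would look like: run_without_threaded_optimizations java -jar opsu.jar (opsu.jar is a placeholder here, since the actual jar path is not shown in this thread).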

OMG it literally works with that env variable!!! How did you think of that, and why does it work? I’d love to hear the technical details about why that would work since I’m a developer and love hearing about technical details.

Lots of speculation involved. The native function it was crashing in was
C [libc.so.6+0x97f74] pthread_mutex_lock+0x4
so it was about threading, using pthreads.
The Java frame was in
sun.awt.X11.XlibWrapper.XGetDefault
so the crash happened while calling Xlib.
Xlib + pthreads + NVIDIA driver = threaded optimizations.
In conclusion, this means the bug is likely not in the NVIDIA driver but in the application, which does not use threading correctly/safely. I suspect it only worked with earlier NVIDIA drivers by chance, and changes in the driver’s threaded optimizations altered the timing, which now triggers the bug. So turning off threaded optimizations is a valid workaround.
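
The two frames used in that reasoning can be pulled out of the hs_err log directly. Here is a minimal sketch, assuming the usual HotSpot layout in which the crashing native frame follows the "# Problematic frame:" line and the Java stack follows a "Java frames:" header. The helper name summarize_hs_err is made up for illustration:

```shell
# Hypothetical helper: print the native "problematic frame" and the first
# Java frames from a HotSpot hs_err_pid*.log file given as $1.
summarize_hs_err() {
    log="$1"
    # The line right after "# Problematic frame:" names the native frame.
    grep -A1 '^# Problematic frame:' "$log" | tail -n 1
    # Up to five lines after the "Java frames:" header; keep only the
    # interpreted (j) and compiled (J) Java frames, innermost first.
    grep -A5 '^Java frames:' "$log" | grep '^[jJ] '
}
```

For the log in this thread the call would be: summarize_hs_err hs_err_pid569494.log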

That makes sense, nice find. Definitely seems like it could be a race condition since it has to do with threading like you said, plus it’s an older application too so it makes sense that threading support was likely not as strong/not properly implemented like it is these days. Thanks again. Hopefully this game never breaks again in future updates :)