Nvidia 495.44 crash in drivers on exit for EGL applications

libEGL_nvidia.so.0 appears to install an exit handler which can fail for some applications and possibly result in kernel panics.

This is on Arch Linux with the nvidia-utils 495.44-5 package, which appears to contain the top-level library blamed in the stack trace. The system was restarted beforehand to ensure there were no driver/kernel/userland inconsistencies.

The full stack trace follows:

Thread 1 "demo" received signal SIGSEGV, Segmentation fault.
0x00007ffff7587f70 in ?? () from /usr/lib/libnvidia-glsi.so.495.44
(gdb) bt
#0  0x00007ffff7587f70 in ?? () from /usr/lib/libnvidia-glsi.so.495.44
#1  0x00007ffff758de0e in _nv004glsi () from /usr/lib/libnvidia-glsi.so.495.44
#2  0x00007ffff6236339 in ?? () from /usr/lib/libnvidia-eglcore.so.495.44
#3  0x00007ffff6228d05 in ?? () from /usr/lib/libnvidia-eglcore.so.495.44
#4  0x00007ffff6249c30 in ?? () from /usr/lib/libnvidia-eglcore.so.495.44
#5  0x00007ffff796d66f in ?? () from /usr/lib/libEGL_nvidia.so.0
#6  0x00007ffff796d737 in ?? () from /usr/lib/libEGL_nvidia.so.0
#7  0x00007ffff79d9a75 in ?? () from /usr/lib/libEGL_nvidia.so.0
#8  0x00007ffff79d9aed in ?? () from /usr/lib/libEGL_nvidia.so.0
#9  0x00007ffff796ca79 in ?? () from /usr/lib/libEGL_nvidia.so.0
#10 0x00007ffff79d9d4f in ?? () from /usr/lib/libEGL_nvidia.so.0
#11 0x00007ffff7dae4a7 in __run_exit_handlers () from /usr/lib/libc.so.6
#12 0x00007ffff7dae64e in exit () from /usr/lib/libc.so.6
#13 0x00007ffff7d96b2c in __libc_start_main () from /usr/lib/libc.so.6
#14 0x00005555555561ce in _start ()
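
Frames #11 to #13 show that the crash happens inside a handler run by exit() after main has already returned. For readers unfamiliar with that mechanism, here is a minimal sketch of how a library can register such a handler with atexit(). This is not the driver's code; library_cleanup, register_cleanup and cleanup_ran are hypothetical names used only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag so a caller can check whether the handler has run yet. */
int cleanup_ran = 0;

/* Hypothetical stand-in for the cleanup routine a library might register. */
void library_cleanup(void)
{
    cleanup_ran = 1;
    puts("library_cleanup: called from exit(), after main() returned");
}

/* Registers the handler with the C runtime; atexit() returns 0 on success. */
int register_cleanup(void)
{
    return atexit(library_cleanup);
}
```

If a program calls register_cleanup() from main and then returns normally, the message is printed only once the C runtime runs its exit handlers, which matches where the backtrace above places the fault.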

Reproduction steps as follows:

  1. Reboot
  2. Start an X11 session.
  3. Start a nested gnome-shell session with gnome-shell --nested --wayland
  4. Start the sample application
  5. In the GNOME session, click the application name (Unknown) in the title bar and select “Quit”.
  6. Observe the segfault in the NVIDIA userland driver libraries.
  7. Attempt a reboot and observe a kernel panic attributed to the NVIDIA kernel modules.

Test environment:
Arch Linux, kernel 5.15.2-arch1-1
nvidia-utils 495.44-5
nvidia 495.44-9
egl-wayland 1:1.1.9+2+gdaab854-1
gnome-shell 1:41.1-1
xorg-server 21.1.1-3
Test application: kkartaltepe/wayland-egl-simple on GitHub
GTX 1070, no other graphics devices.

Note: The sample application leaks a number of Wayland and EGL resources on exit. I would still expect the driver to be robust against such a common case, since these crashes do not occur in similarly leaky X11/GLX applications.