nvidia-drivers-430.09 causes Xorg segfault at start

After upgrading to nvidia-drivers 430.09, Xorg no longer starts. The Xorg log shows the following:

[ 20.128] (II) Initializing extension NV-GLX
[ 20.128] (II) Initializing extension NV-CONTROL
[ 20.128] (II) Initializing extension XINERAMA
[ 20.130] (EE)
[ 20.130] (EE) Backtrace:
[ 20.132] (EE) 0: /usr/libexec/Xorg (xorg_backtrace+0x4d) [0x56419accfacd]
[ 20.132] (EE) 1: /usr/libexec/Xorg (0x56419ab25000+0x1ae769) [0x56419acd3769]
[ 20.132] (EE) 2: /lib64/libpthread.so.0 (0x7f0b947ce000+0x148b0) [0x7f0b947e28b0]
[ 20.132] (EE) 3: /lib64/libc.so.6 (memcpy+0x1f) [0x7f0b946a923f]
[ 20.132] (EE) 4: /usr/lib64/libnvidia-glcore.so.430.09 (0x7f0b920c5000+0x1188d69) [0x7f0b9324dd69]
[ 20.132] (EE) 5: /usr/lib64/libnvidia-glcore.so.430.09 (0x7f0b920c5000+0x1188ecd) [0x7f0b9324decd]
[ 20.132] (EE) 6: /usr/lib64/libnvidia-glcore.so.430.09 (0x7f0b920c5000+0xe72ed8) [0x7f0b92f37ed8]
[ 20.132] (EE) 7: /usr/lib64/xorg/modules/extensions/libglxserver_nvidia.so (0x7f0b9080a000+0x8c2d42) [0x7f0b910ccd42]
[ 20.132] (EE)
[ 20.133] (EE) Segmentation fault at address 0x7f0b911c6000
[ 20.133] (EE)
Fatal server error:
[ 20.133] (EE) Caught signal 11 (Segmentation fault). Server aborting
[ 20.133] (EE)
[ 20.133] (EE)
Please consult the The X.Org Foundation support
at http://wiki.x.org
for help.
[ 20.133] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[ 20.133] (EE)

X.Org X Server 1.20.4
X Protocol Version 11, Revision 0
[ 17.166] Build Operating System: Linux 4.20.13-gentoo-zfs x86_64 Gentoo
[ 17.166] Current Operating System: Linux PF16W6Y2 4.20.17-gentoo #11 SMP PREEMPT Thu Apr 25 09:39:24 MSK 2019 x86_64
[ 17.166] Kernel command line: BOOT_IMAGE=/system/boot@/vmlinuz-4.20.17-gentoo root=ZFS=fast/system ro spectre_v2=off nopti resume=PARTUUID=a1105dae-9cb4-4b37-984f-b1a64836f798 scsi_mod.use_blk_mq=1 init=/usr/lib/systemd/systemd vconsole.font=ter-v32n rd.shell

PF16W6Y2 /var/log # lspci | grep -i vga
01:00.0 VGA compatible controller: NVIDIA Corporation GP107GLM [Quadro P2000 Mobile] (rev a1)
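For anyone comparing logs, a minimal sketch for pulling only the error lines and the libraries named in the backtrace out of an Xorg log. The function names `xorg_errors` and `backtrace_objects` are made up for illustration, and the pattern assumes backtrace frames formatted like the ones pasted above.

```shell
#!/bin/sh
# Print only the (EE) lines from an Xorg log.
# Defaults to /var/log/Xorg.0.log, the path the server names in its own message.
xorg_errors() {
    grep -F '(EE)' "${1:-/var/log/Xorg.0.log}"
}

# List the files named in the backtrace frames, collapsing adjacent repeats.
# Assumes frames look like: "[ t] (EE) N: /path/to/lib.so (...) [0x...]".
backtrace_objects() {
    xorg_errors "$1" | sed -n 's/^.*(EE) [0-9][0-9]*: \([^ ]*\) .*$/\1/p' | uniq
}
```

Calling `backtrace_objects` with no argument reads the default log. On the log above it would list the Xorg binary, libpthread, libc, libnvidia-glcore and libglxserver_nvidia in frame order.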

Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.
https://devtalk.nvidia.com/default/topic/1043347/announcements/attaching-files-to-forum-topics-posts/

Please find the output of nvidia-bug-report.sh attached
nvidia-bug-report.log.gz (1.11 MB)

Which USE flags did you build the X server with? That is, what’s the output of
emerge -pv xorg-server

PF16W6Y2 ~ # emerge -pv xorg-server

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ] x11-base/xorg-server-1.20.4:0/1.20.4::gentoo USE="glamor ipv6 suid systemd udev xorg -debug -dmx -doc (-elogind) -kdrive -libressl -minimal (-selinux) -static-libs -unwind -wayland -xcsecurity -xephyr -xnest -xvfb" 0 KiB

Exact same error here.

# uname -a
Linux xxx 4.19.27-gentoo-r1 #1 SMP Fri Apr 26 02:47:08 CEST 2019 x86_64 Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz GenuineIntel GN

I’m using xorg-server-1.20.3.
With nvidia-drivers-418.56 everything is fine.

# emerge -pv xorg-server

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild   R    ] x11-base/xorg-server-1.20.3:0/1.20.3::gentoo  USE="glamor ipv6 kdrive suid systemd udev xephyr xorg -debug -dmx -doc -libressl -minimal (-selinux) -static-libs -unwind -wayland -xcsecurity -xnest -xvfb" 0 KiB

Total: 1 package (1 reinstall), Size of downloads: 0 KiB

Same here

# uname -a
Linux Laptop-PC 5.0.9-gentoo #2 SMP Fri Apr 26 22:42:23 EDT 2019 x86_64 Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz GenuineIntel GNU/Linux
# emerge -pv xorg-server

These are the packages that would be merged, in order:

Calculating dependencies ... done!                              
[ebuild   R    ] x11-base/xorg-server-1.20.4:0/1.20.4::gentoo  USE="glamor ipv6 suid systemd udev wayland xorg -debug* -dmx -doc (-elogind) -kdrive -libressl -minimal (-selinux) -static-libs -unwind -xcsecurity -xephyr -xnest -xvfb" 0 KiB

Total: 1 package (1 reinstall), Size of downloads: 0 KiB

nvidia-bug-report.log.gz (1.09 MB)

The USE flags don’t look unusual; they differ little from what I’m using. For me, the 430 driver works fine on two systems (kernels 4.9 and 4.19, xorg 1.20.3). My CFLAGS:

CFLAGS="-O2 -march=core-avx2 -pipe"

Is your configuration in /etc/X11 set up to use NVIDIA PRIME/Optimus?

I have two systems running: one desktop and one Optimus notebook using PRIME.

After some troubleshooting, it appears that building nvidia-drivers with the “compat” USE flag leads to the segfault. Building nvidia-drivers with USE="-compat" works around it.
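A minimal sketch of making that workaround persistent, assuming /etc/portage/package.use is a directory on your system (it can also be a single file, in which case append the line to it instead). The helper name `write_nocompat_entry` and the file name `nvidia-drivers` are arbitrary choices for illustration.

```shell
#!/bin/sh
# Write a per-package USE entry that disables "compat" for nvidia-drivers only.
# $1 is the package.use directory (assumed layout: one file per package).
write_nocompat_entry() {
    mkdir -p "$1" &&
    printf '%s\n' 'x11-drivers/nvidia-drivers -compat' > "$1/nvidia-drivers"
}

# Typical use, as root (left commented out so nothing is changed by accident):
#   write_nocompat_entry /etc/portage/package.use
#   emerge --ask --oneshot x11-drivers/nvidia-drivers
```

Afterwards, `emerge -pv x11-drivers/nvidia-drivers` should list `-compat` among the USE flags.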

Nice find. The non-glvnd compat libraries are going to be dropped in the next driver release, so the “compat” USE flag will vanish anyway.
https://devtalk.nvidia.com/default/topic/1032650/linux/unix-graphics-feature-deprecation-schedule/

Is everyone with this problem (wilson.meier, nsane457) using ZFS?

Unlikely. Users in the Gentoo forums reported the same problem; they needed the compat libs for Bumblebee’s primus bridge.
This non-Gentoo report looks the same:
https://devtalk.nvidia.com/default/topic/1051151/linux/the-system-is-running-in-low-graphics-mode-in-ubuntu-16-04/

I do use ZFS on my laptop.

Does the problem reproduce if you use the .run installer, or is it only with the Gentoo “package”?

Can you all please confirm which flags you used when installing the driver that triggers the problem?

Yes, I’m using ZFS too.

# emerge -av \=x11-drivers/nvidia-drivers-430.09

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild     U ~] x11-drivers/nvidia-drivers-430.09:0/430::gentoo [418.56:0/418::gentoo] USE="X acpi compat driver gtk3 kms multilib tools uvm -static-libs -wayland" ABI_X86="32 (64) (-x32)" 0 KiB

I’m not using ZFS.

Yes. If I uninstall the Gentoo ebuild, unpack the .run installer, and execute

./nvidia-installer --no-glvnd-glx-client --no-glvnd-egl-client

and rebuild the kernel initramfs with dracut, the same segfault is reproduced.
I’ll upload the associated nvidia-bug-report.

“--no-glvnd-glx-client --no-glvnd-egl-client” when installing via nvidia-installer, and the “compat” USE flag when installing via Gentoo Portage.

EDIT: Passing both “--no-glvnd-glx-client --no-glvnd-egl-client” triggers the segfault, but I haven’t tested whether just one of the flags is enough.
nvidia-bug-report.log.gz (1.12 MB)
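To narrow down the untested case from the EDIT above, each flag could be tried on its own. The sketch below only prints the two installer invocations and installs nothing; the helper name `installer_cases` is made up for illustration, and the flags are the ones quoted in this thread.

```shell
#!/bin/sh
# Print one installer invocation per flag so each can be tested separately.
# This is a dry run: the commands are echoed, not executed.
installer_cases() {
    for flag in --no-glvnd-glx-client --no-glvnd-egl-client; do
        printf './nvidia-installer %s\n' "$flag"
    done
}

installer_cases
```

Run each printed command from the unpacked .run directory, restart Xorg after each install, and note which case reproduces the backtrace.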