X won't start with GTX 780Ti card and 450.66 driver

or indeed any version of the 450.xx driver that I have tried. This means I can’t use the card with the Linux 5.8 kernel series.

The message I get, on a white screen, is:

Oh no! Something has gone wrong.
A problem has occurred and the system can’t recover.
Please contact a system administrator.

Bug report attached:

nvidia-bug-report.log.gz (1.2 MB)

The only thing I see going wrong in your bug report log is that the X server can’t load its libglx.so module:

(EE) Failed to load /usr/lib64/xorg/modules/extensions/libglx.so: /lib64/libGL.so.1: undefined symbol: __GLXGL_CORE_FUNCTIONS

but I don’t think that should be fatal if you’re not using Mesa.

Can you boot to a text console and try starting an X server with xinit -- -retro, and then run glxinfo? That should tell you whether or not OpenGL is working at all.

I don’t see any SELinux errors in your log, but if you have SELinux enabled please try disabling it to see if the problem goes away. There’s a known problem in older versions of the Fedora selinux-policy package that breaks gdm.

Finally, nvidia-bug-report.sh doesn’t collect information about gdm failures. Can you please check journalctl to see if there are any obvious errors coming from gdm explaining why it’s failing?

glxinfo just says:

glxinfo: symbol lookup error: /lib64/libGL.so.1: undefined symbol: __GLXGL_CORE_FUNCTIONS

so I guess it was fatal after all? (just guessing from what you said - I don’t know anything about this area).

I can’t see anything obvious from gdm in journalctl. I’ve copied and pasted a section around some gdm messages below (and can you tell me what I need to do to disable SELinux?):

Sep 18 19:40:29 colin audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=‘unit=dbus-:1.9-org.fedoraproject.SetroubleshootPrivileged@0 comm=“systemd” exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success’
Sep 18 19:40:30 colin setroubleshoot[7512]: SELinux is preventing (r-launch) from rmdir access on the directory /tmp/namespace-dev-ikSQC8/dev. For complete SELinux messages run: sealert -l de5718ac-1be0-447b-b8fc-2131a597d105
Sep 18 19:40:30 colin python3[7512]: SELinux is preventing (r-launch) from rmdir access on the directory /tmp/namespace-dev-ikSQC8/dev.

                                 *****  Plugin catchall (100. confidence) suggests   **************************
                                 
                                 If you believe that (r-launch) should be allowed rmdir access on the dev directory by default.
                                 Then you should report this as a bug.
                                 You can generate a local policy module to allow this access.
                                 Do
                                 allow this access for now by executing:
                                 # ausearch -c '(r-launch)' --raw | audit2allow -M my-rlaunch
                                 # semodule -X 300 -i my-rlaunch.pp

Sep 18 19:40:30 colin setroubleshoot[7512]: SELinux is preventing (r-launch) from rmdir access on the directory dev. For complete SELinux messages run: sealert -l de5718ac-1be0-447b-b8fc-2131a597d105
Sep 18 19:40:30 colin python3[7512]: SELinux is preventing (r-launch) from rmdir access on the directory dev.

                                 *****  Plugin catchall (100. confidence) suggests   **************************
                                 
                                 If you believe that (r-launch) should be allowed rmdir access on the dev directory by default.
                                 Then you should report this as a bug.
                                 You can generate a local policy module to allow this access.
                                 Do
                                 allow this access for now by executing:
                                 # ausearch -c '(r-launch)' --raw | audit2allow -M my-rlaunch
                                 # semodule -X 300 -i my-rlaunch.pp

Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (**) Option “fd” “42”
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) event1 - Power Button: device removed
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (**) Option “fd” “45”
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) event0 - Power Button: device removed
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (**) Option “fd” “46”
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) event2 - Logitech USB-PS/2 Optical Mouse: device removed
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (**) Option “fd” “47”
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) event3 - Logitech USB Keyboard: device removed
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (**) Option “fd” “48”
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) event4 - Logitech USB Keyboard Consumer Control: device removed
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (**) Option “fd” “49”
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) event5 - Logitech USB Keyboard System Control: device removed
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (**) Option “fd” “50”
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) event7 - Eee PC WMI hotkeys: device removed
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) systemd-logind: got pause for 13:66
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) systemd-logind: got pause for 13:69
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) systemd-logind: got pause for 13:71
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) systemd-logind: got pause for 13:67
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) systemd-logind: got pause for 13:68
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) systemd-logind: got pause for 13:64
Sep 18 19:40:31 colin /usr/libexec/gdm-x-session[7171]: (II) systemd-logind: got pause for 13:65
Sep 18 19:40:31 colin kernel: rfkill: input handler enabled
Sep 18 19:40:31 colin systemd[1]: Started Getty on tty2.
Sep 18 19:40:31 colin audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg=‘unit=getty@tty2 comm=“systemd” exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success’
Sep 18 19:40:31 colin setroubleshoot[7512]: SELinux is preventing firewalld from write access on the directory /tmp/ffi12wr3P. For complete SELinux messages run: sealert -l 272106cd-b9e4-4952-9dcb-1510727c0994
Sep 18 19:40:31 colin python3[7512]: SELinux is preventing firewalld from write access on the directory /tmp/ffi12wr3P.

                                 *****  Plugin catchall (100. confidence) suggests   **************************
                                 
                                 If you believe that firewalld should be allowed write access on the ffi12wr3P directory by default.
                                 Then you should report this as a bug.
                                 You can generate a local policy module to allow this access.
                                 Do
                                 allow this access for now by executing:
                                 # ausearch -c 'firewalld' --raw | audit2allow -M my-firewalld
                                 # semodule -X 300 -i my-firewalld.pp

I changed SELinux from permissive to disabled. This did not make any difference.
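For anyone else wondering how to check or change the SELinux mode, here is a rough sketch. It assumes the usual Fedora layout, where the persistent mode is the `SELINUX=` line in `/etc/selinux/config`; the function name `selinux_config_mode` is made up for illustration.

```shell
# Hypothetical helper: print the mode configured by the last uncommented
# SELINUX= line in the given file (default: /etc/selinux/config).
selinux_config_mode() {
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([a-z]*\).*$/\1/p' \
        "${1:-/etc/selinux/config}" | tail -n 1
}
```

`getenforce` reports the mode currently in effect, and `sudo setenforce 0` switches to permissive until the next reboot. Changing the `SELINUX=` line to `disabled` only takes effect after a reboot.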

So it seems the __GLXGL_CORE_FUNCTIONS message is the key. What is the next step?

Problem persists with 450.80.02.

Thanks for testing glxinfo. The NVIDIA driver package doesn’t provide libGL.so.1 itself anymore; that’s supposed to come from the OpenGL Vendor-Neutral Dispatch library, libglvnd.
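To confirm which library the loader is complaining about, and which package owns it, something like the following might help. It is only a sketch: `failing_lib_and_symbol` is a made-up helper name, and it assumes the error text has the same form as the two messages quoted earlier in this thread.

```shell
# Hypothetical helper: read loader error text on stdin and print
# "<library> <symbol>" for each "undefined symbol" message found.
failing_lib_and_symbol() {
    sed -n 's/^.*[: ]\(\/[^ :]*\): undefined symbol: \([^ ]*\).*$/\1 \2/p'
}

# Possible use on the affected machine:
#   glxinfo 2>&1 | failing_lib_and_symbol
#   rpm -qf /lib64/libGL.so.1    # ask rpm which package owns the file
```

If `rpm -qf` reports a package other than a libglvnd one, or says the file is not owned by any package, that would point at a stray copy of the library.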

From a quick search, it sounds like a corrupted libglvnd install has happened to at least one other Fedora user: https://unix.stackexchange.com/questions/603697/startx-fails-with-libgl-so-1-undefined-symbol-glxgl-core-functions

Maybe the suggested fix from that will help?

sudo dnf reinstall libglvnd-glx

I have already seen that post and tried that remedy - it didn’t help.

I’ll try uninstalling and installing afresh, in case there was something wrong with the reinstall process.

So I tried it again. It didn’t help, but I noticed the reinstall gave several warning messages about empty files in /lib ( /li/lib…OpenCL…, or something like that).

As I elect not to install the 32-bit compatibility files, I deleted these files (one file and three symbolic links), and tried the reinstall again.
Now it works fine.

Thank you for your help.
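In case it helps someone else, this is roughly how the empty files described above could be listed. It is a sketch under assumptions: the helper name `find_empty_libs` is made up, it only looks at the top level of each directory given, and it only considers names containing `.so`.

```shell
# Hypothetical helper: list zero-length shared-library files directly inside
# the given directories. -L follows symbolic links, so a link that points at
# an empty file is listed as well.
find_empty_libs() {
    find -L "$@" -maxdepth 1 -type f -size 0 -name '*.so*' 2>/dev/null
}
```

For example, `find_empty_libs /lib /lib64` prints the candidates without changing anything. Check which package owns each file (`rpm -qf`) before deleting it.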


Excellent, I’m glad to hear that it’s working again.

Empty files in places like /lib can often happen if the system crashes or is powered off after installing updated libraries but before the data for those libraries was flushed to disk. I’d recommend looking for a way to verify the contents of everything installed by the package manager. From a quick search, it seems like rpm -Va might do the trick.
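Because `rpm -Va` prints a line for every difference it finds, including harmless ones such as edited config files, a small filter can make the output easier to read. This is a sketch, not a definitive recipe: `suspicious_rpm_changes` is a made-up name, and it assumes the standard verify output, where the first column is a flag string (`S` = size differs, `5` = digest differs) or the word `missing`, optionally followed by an attribute marker such as `c` for config files.

```shell
# Hypothetical filter: read `rpm -Va` output on stdin and keep only non-config
# entries that are missing or whose size (S) or digest (5) differs.
suspicious_rpm_changes() {
    awk '
        $2 == "c" { next }                  # config files are expected to change
        $1 == "missing" { print; next }
        $1 ~ /^S/ || $1 ~ /^..5/ { print }
    '
}
```

A possible invocation is `sudo rpm -Va 2>/dev/null | suspicious_rpm_changes`. A library that was truncated to zero bytes should show up with both the `S` and `5` flags set, and reinstalling the owning package with `dnf reinstall` would be the natural follow-up.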