Xorg and 'modprobe nvidia-modeset' both stall out and break the system. RTX-2080 (and entire system) unusable.

For what’s probably been more than a year now, I have been unable to use nVidia’s proprietary drivers. At first I thought this was due to my graphics card (a GTX 780) being outdated, and so I switched to the open source Nouveau drivers, which until this week I’d been using with little issue. In this period I regularly tried to use nVidia’s drivers again, with every new kernel and driver release, and always with the exact same issues.

Which brings me to this week, when I bought an RTX 2080 as an upgrade. Lo and behold, the exact same issues persist. Worse yet, the Nouveau drivers now constantly lock up my PC as well, making it effectively impossible to use my PC with this new graphics card.

So - in desperation - I’ve finally come here for support.

The problems:

  • Xorg does not work, at all. As soon as it tries to initialize, I get a white non-blinking cursor line on a black screen, and the entire system hangs.
  • From the command line, 'modprobe nvidia-modeset' stalls out, ultimately causing the following dmesg entries:
    systemd-udevd[2431]: nvidia: Worker [2564] processing SEQNUM=2480 is taking a long time
    systemd-udevd[2431]: 0000:01:00.0: Worker [2494] processing SEQNUM=1972 is taking a long time
    systemd-udevd[2564]: Spawned process 'nvidia-udev.sh add' [2616] is taking longer than 59s to complete
    systemd-udevd[2564]: Spawned process 'nvidia-udev.sh add' [2616] timed out after 2min 59s, killing
    systemd-udevd[2431]: nvidia: Worker [2564] processing SEQNUM=2480 killed
    systemd-udevd[2431]: Worker [2564] terminated by signal 9 (KILL)
    systemd-udevd[2431]: nvidia: Worker [2564] failed
    
  • As soon as 'modprobe nvidia-modeset' is run, a systemd-udevd process takes up 100% CPU utilization and permanently stays stuck in that state.

Somewhere in my configuration of kernel, udev, and drivers, something is going horribly wrong, but for the life of me I have not been able to find it. And now that Nouveau can’t really be used, this leaves me with a broken system. I could return to the GTX 780, but then that RTX would be a rather expensive paperweight. ;-)
Attachment: nvidia-bug-report.log (289 KB)

IIRC, nvidia-udev.sh is a Gentoo script started by a udev rule that (mis)uses nvidia-smi to make sure the device nodes are created. On some systems that fails due to timing problems. Try disabling the udev rule.
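
One way to disable a packaged udev rule without deleting it is to mask it: udev ignores a rules file from /lib/udev/rules.d when a symlink with the same name in /etc/udev/rules.d points to /dev/null. A minimal sketch follows. The helper name mask_udev_rule is made up for illustration, and the rule file name 99-nvidia.rules in the usage comment is an assumption; check which file actually calls nvidia-udev.sh on your system first.

```shell
#!/bin/sh
# Hypothetical helper: mask a udev rules file by symlinking it to /dev/null.
#   $1 = rules file name (e.g. the file that runs nvidia-udev.sh)
#   $2 = override directory (defaults to /etc/udev/rules.d)
mask_udev_rule() {
    rule="$1"
    dir="${2:-/etc/udev/rules.d}"
    ln -sf /dev/null "$dir/$rule"
}

# Usage (as root). Find the rule first; the file name below is an assumption:
#   grep -l nvidia-udev.sh /lib/udev/rules.d/*.rules
#   mask_udev_rule 99-nvidia.rules
#   udevadm control --reload
```

Removing the symlink again restores the packaged rule.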

It’s embarrassing that in a year of mucking about I never stumbled onto this, but learning that nvidia-udev.sh isn’t part of the official distribution led me to this solution: https://bugs.gentoo.org/670340#c8

Simply blacklisting the nvidia modules, then loading them myself, solves the issues and lets me properly use the proprietary drivers. Much obliged. :)
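
For anyone landing here with the same symptoms, the blacklisting step could look roughly like the sketch below. The helper name write_nvidia_blacklist, the file name nvidia-blacklist.conf, and the exact list of modules are assumptions for illustration, not necessarily what the linked Gentoo comment uses. Note that a modprobe.d "blacklist" entry only stops automatic loading via module aliases; an explicit modprobe by name still works, which is what makes loading the driver by hand possible.

```shell
#!/bin/sh
# Hypothetical helper: write a modprobe.d file that blacklists the NVIDIA modules.
#   $1 = target directory (defaults to /etc/modprobe.d)
write_nvidia_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    # Module list is an assumption; trim it to the modules your driver installs.
    for mod in nvidia nvidia_modeset nvidia_drm nvidia_uvm; do
        printf 'blacklist %s\n' "$mod"
    done > "$dir/nvidia-blacklist.conf"
}

# Usage (as root):
#   write_nvidia_blacklist
# After a reboot, load the driver manually before starting X:
#   modprobe nvidia-modeset
```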