I run Ubuntu 18.04 with kernel 5.3.0-28 and use two graphics cards, an Nvidia Quadro K4200 and a GTX 750 Ti. Both screens are connected to the K4200 (via DVI and DisplayPort), and I use the GTX 750 Ti for CUDA computations.
It was running well for some months with the nvidia driver (I don’t remember the version now) and CUDA 10.0.
After a recent reboot, one of the screens stopped working. Moreover, nvidia-smi produced the following error:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
“Nvidia X Server settings” was displaying an empty window.
Here is the list of the things I already tried:
1) Reinstall drivers
I deleted the nvidia drivers (and also cuda, just to be sure):
sudo apt purge nvidia*
sudo apt purge libnvidia*
sudo apt purge cuda*
sudo apt autoremove
Then I installed the latest driver (440) that supports both of my graphics cards according to Nvidia:
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-440
sudo reboot now
After rebooting, the system shows the GUI login screen, but after I enter the password it gets stuck on the purple boot screen.
I can switch to a terminal (Ctrl+Alt+F5), log in there and verify that nvidia-smi works. However, once I try to restart gdm
sudo systemctl restart gdm.service
I get back to an infinite login loop in the GUI.
I have also tried different driver versions (410, 430) and installing the drivers via
sudo ubuntu-drivers autoinstall
The behaviour is exactly the same.
2) Kernel parameters
I tried booting with the kernel parameter:
nvidia-drm.modeset=1
as suggested here: https://askubuntu.com/questions/1048274/ubuntu-18-04-stopped-working-with-nvidia-drivers. It didn’t help.
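To rule out that the parameter simply never reached the kernel, the running command line can be checked in /proc/cmdline. A minimal sketch; the function name has_kernel_param is made up for illustration, and the optional second argument exists only so the check can be tried against a saved copy of the command line:

```shell
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds (exit status 0) if PARAM appears as a whole word on the kernel
# command line. CMDLINE_FILE defaults to /proc/cmdline; the override is a
# hypothetical convenience for testing against a saved file.
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put each parameter on its own line, then look for an exact, literal match.
    tr -s '[:space:]' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}
```

For example, `has_kernel_param nvidia-drm.modeset=1 && echo "parameter is active"` prints the message only if the parameter was actually passed at boot.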
3) xorg.conf
I tried the solution proposed here: https://devtalk.nvidia.com/default/topic/1043405/linux/ubuntu-18-04-headless_390-intel-igpu-after-prime-select-intel-lost-contact-to-geforce-1050ti/post/5293003/#5293003
I set the driver to modesetting in /etc/X11/xorg.conf
Section "Device"
Identifier "GK104GL"
Driver "modesetting"
BusID "PCI:0:5:0"
EndSection
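One detail that may be worth double-checking in a hand-written Device section is the BusID. lspci prints addresses as bus:device.function in hexadecimal, while xorg.conf expects "PCI:bus:device:function" in decimal. The helper below is only a sketch of that conversion; its name lspci_to_busid is hypothetical, and the post does not say which lspci address the K4200 actually has:

```shell
# lspci_to_busid ADDR
# Convert an lspci-style address (hex "bus:dev.func", optionally prefixed by a
# PCI domain such as "0000:") into the decimal form used by xorg.conf.
lspci_to_busid() {
    addr=$1
    # Drop a leading domain ("0000:") if the address has two colons.
    case $addr in
        *:*:*) addr=${addr#*:} ;;
    esac
    bus=${addr%%:*}      # text before the colon
    rest=${addr#*:}      # "dev.func"
    dev=${rest%%.*}
    func=${rest#*.}
    # printf interprets the 0x-prefixed values as hexadecimal.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$func"
}
```

With this, `lspci_to_busid 0a:00.0` prints `PCI:10:0:0`. Note that "PCI:0:5:0" as written above means bus 0, device 5, function 0.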
I could log in and nvidia-smi worked as expected; however, the mouse pointer had a 1-2 second delay, making it unusable.
Also, I don’t think this is a proper solution because of 4)
4) Clean install
I made a clean install of Ubuntu 18.04 (same kernel 5.3.0-28) on a new SSD in the same PC. I then installed the nvidia driver (440) as described above, and everything works as expected (screens, login, nvidia-smi).
This suggests the issue is not with the BIOS or kernel flags.
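Since the clean install works on the same hardware, one way to narrow things down might be to compare the package lists of the two installs. The sketch below assumes each list was saved one package per line, for example with `dpkg-query -W -f='${Package}\n' > broken.txt`; the file names and the function name only_in_first are hypothetical:

```shell
# only_in_first FILE_A FILE_B
# Print the lines (package names) that occur in FILE_A but not in FILE_B.
only_in_first() {
    a_sorted=$(mktemp)
    b_sorted=$(mktemp)
    # comm needs sorted input; use the same locale for sort and comm.
    LC_ALL=C sort -u "$1" > "$a_sorted"
    LC_ALL=C sort -u "$2" > "$b_sorted"
    # -2 and -3 hide lines unique to FILE_B and lines common to both.
    LC_ALL=C comm -23 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}
```

Running `only_in_first broken.txt fresh.txt` would then list packages present only on the broken system, which can be filtered further with `grep -i nvidia`.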
5) Additional info
I have the nvidia profile set in
prime-select
I can switch from the nvidia profile to the intel profile; then I am able to log in to the desktop, but only one screen works.
With the Nvidia drivers uninstalled I can log in and both screens are functional.
Obviously, nvidia-smi and CUDA are not available in that case.
I would be really thankful for any hint that would help in solving this issue.
UPD:
I checked /var/log/Xorg.0.log and there are two errors:
[7.219] (EE) /dev/dri/card0: No such file or directory
[7.219] (EE) /dev/dri/card0: No such file or directory
[7.220] (EE) Screen 0 deleted because of no matching config section
These errors are not present in the log of the fresh, working Ubuntu install.
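To make the comparison between the two logs repeatable, the error lines can be extracted with the timestamps stripped, so that identical messages from different boots line up. A minimal sketch; the function name xorg_errors is made up, and it assumes the "[timestamp] (EE) message" layout shown above:

```shell
# xorg_errors LOGFILE
# Print the distinct (EE) messages from an Xorg log, without timestamps.
xorg_errors() {
    # Keep only lines of the form "[ 7.219] (EE) message" and print the
    # message part; sort -u collapses repeated messages.
    sed -n 's/^\[ *[0-9.]*\] (EE) //p' "$1" | LC_ALL=C sort -u
}
```

Running `xorg_errors /var/log/Xorg.0.log` on both installs and comparing the output shows which errors are specific to the broken system. Listing /dev/dri with `ls -l /dev/dri` shows whether the card0 node from the first error exists at all.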