I recently upgraded to Ubuntu 16.04 (kernel 4.4.0-97) and am having major issues with the Nvidia drivers. I have experience fixing the driver install after kernel upgrades (e.g. booting into recovery mode and reinstalling the driver), but none of the usual tricks are working: if nvidia-387, nvidia-384, or nvidia-375 is installed, the boot crashes to a blank screen with no opportunity to reach a TTY.
The only “solution” I’ve found is to purge nvidia-*, and then I can boot with no issues.
What additional information can I provide to assist in troubleshooting?
The Xorg.0.log file can give you some information, or you can run nvidia-bug-report.sh to collect all the info. See the thread “If you have a problem, PLEASE read this first” in the Linux section of the NVIDIA Developer Forums.
I am not able to ssh into 4.4.0-96 or 4.4.0-97, nor get into recovery mode. However, I am able to boot into recovery on 4.4.0-59 and was able to run nvidia-bug-report.sh:
https://s3-us-west-2.amazonaws.com/open-science/public/nvidia-bug-report.log.gz
I also tried uninstalling the nvidia drivers, booting into 4.4.0-97, stopping X, installing nvidia-387, and then running nvidia-bug-report.sh:
https://s3-us-west-2.amazonaws.com/open-science/public/nvidia-bug-report-after-stop-x-then-install.log
As soon as I run startx -- -logverbose 6, the system crashes and I cannot access a shell.
Many thanks for the help!
You have nvidiafb in your kernel config, that’s blocking the nvidia driver. If you compile your own kernel, disable it in kernel config. Otherwise blacklist the module.
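For anyone following along, blacklisting the module could look roughly like the sketch below. The function name blacklist_nvidiafb and the file name blacklist-nvidiafb.conf are illustrative choices, not something from this thread. Note that modprobe only reads files in /etc/modprobe.d whose names end in .conf.

```shell
# Sketch: write a blacklist entry for nvidiafb into a modprobe.d directory.
# The directory defaults to /etc/modprobe.d; the file name below is an
# arbitrary choice, but it must end in .conf for modprobe to read it.
blacklist_nvidiafb() {
    dir="${1:-/etc/modprobe.d}"
    printf 'blacklist nvidiafb\n' > "$dir/blacklist-nvidiafb.conf"
}
```

Call blacklist_nvidiafb from a root shell to write to the default location. On Ubuntu it is usually also necessary to run update-initramfs -u afterwards so the setting is present in the initramfs as well.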
To get a bit more info, can you please repeat this:
> I also tried uninstalling the nvidia drivers, booting into 4.4.0-97, stopping X, install nvidia-387, and then ran nvidia-bug-report.sh:
but before running nvidia-bug-report.sh, load the kernel driver with
modprobe nvidia
I added “blacklist nvidiafb” to /etc/modprobe.d/blacklist. By the way, I am installing with “sudo apt install nvidia-387”, although at one point I did try installing through the Additional Packages section of the Ubuntu GUI.
Here’s the new log as collected with your instructions to do modprobe nvidia first:
https://s3-us-west-2.amazonaws.com/open-science/public/nvidia-bug-report-post-modprobe.log.gz
The kernel driver loads fine and the GPU is working, but the persistence daemon is complaining about missing or inaccessible device files. After loading the nvidia modules, please post the output of
ls /dev/nvidia* -l
$ ls /dev/nvidia* -l
crw-rw-rw- 1 root root 195, 0 Oct 20 17:17 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Oct 20 17:17 /dev/nvidiactl
crw-rw-rw- 1 root root 241, 0 Oct 20 17:17 /dev/nvidia-uvm
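The same device-file check can be scripted, which may help when the only access is over ssh. This is a minimal sketch: the function name check_nvidia_nodes is hypothetical, and it only tests that /dev/nvidia0 and /dev/nvidiactl exist as character devices (it assumes a single GPU and does not inspect permissions).

```shell
# Sketch: report whether the basic NVIDIA device nodes exist as character devices.
# The directory defaults to /dev; nvidia0 assumes a single-GPU machine.
check_nvidia_nodes() {
    dir="${1:-/dev}"
    status=0
    for node in nvidia0 nvidiactl; do
        if [ -c "$dir/$node" ]; then
            echo "ok: $dir/$node"
        else
            echo "missing: $dir/$node"
            status=1
        fi
    done
    return "$status"
}
```

Run check_nvidia_nodes after modprobe nvidia; a non-zero exit status means at least one of the nodes was not found.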
I was able to ssh in and keep the connection alive after running startx -- -logverbose 6. Here is the bug report:
https://s3-us-west-2.amazonaws.com/open-science/public/nvidia-bug-report-xcrash.log.gz
I am delighted to report that, after many hours of troubleshooting, the problem is resolved! Many thanks for your help, jkfloris and generix.
The key missing step was to run sudo update-initramfs -u after installing the driver, and then everything worked!
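For reference, the working sequence can be sketched as a small script. The helper names run and reinstall_nvidia and the RUN variable are illustrative, not part of any tool. By default the script only prints the commands, so it is safe to try first.

```shell
# Sketch: the install sequence from this thread, with a dry-run guard.
# Commands are only executed when RUN=1 is set; otherwise they are printed.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

reinstall_nvidia() {
    pkg="${1:-nvidia-387}"          # driver package used in this thread
    run sudo apt install "$pkg"     # install the driver
    run sudo update-initramfs -u    # rebuild the initramfs for the newest kernel
}
```

Calling reinstall_nvidia prints the two commands; RUN=1 reinstall_nvidia executes them. A reboot afterwards is presumably needed for the rebuilt initramfs to take effect.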