Cannot successfully update driver from v378.13 to v381.22...

Hi all,

I have servers running Ubuntu Linux 16.04.2 that have (4) GeForce GTX 1080 Ti GPUs installed in each server. I brought them up installing driver v378.13 via the official NVIDIA-Linux-x86_64-378.13.run script. They have been working fine, except today when one server experienced a strange hang (‘nvidia-smi’ as well as a Python script that pulls GPU watts via use of the ‘pynvml’ wrapper module both hung.) So I looked at the driver page, and saw there is a new driver available (v381.22). However, when I went to run the newer NVIDIA-Linux-x86_64-381.22.run file, it terminated with an error “Unable to load the kernel module ‘nvidia.ko’.” I tried many things to overcome this, but nothing has worked to resolve this issue. I will attach the ‘nvidia-installer.log’ and the ‘nvidia-bug-report.log.gz’ to this post, in hopes that someone more knowledgeable can assist me…

Thanks,
Will
nvidia-installer.log (24.8 KB)
nvidia-bug-report.log.gz (327 KB)

Can anyone from NVIDIA support take a look at this issue please?

You should never install NVIDIA drivers directly unless you totally understand what they do to your system - and they do quite a lot.

They replace libglx and various OpenGL libraries, so whenever you update the X.org server package or Mesa* libraries you risk totally breaking your system unless you reinstall the said packages.

There are NVIDIA drivers already packaged for your distro. Please do use them.

Hi Artem,

These servers are not running a GUI/X interface - the GPUs are solely used for computation. The distro-provided drivers are too old to support these newer GPUs, hence installing the ones from NVIDIA (which should and mostly do work fine, excepting this bug…)

Ubuntu indeed contains the most recent stable NVIDIA drivers:

https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa

I don’t know what you’re talking about.

Again, do not attempt to install NVIDIA drivers manually unless you know how they must be installed properly.

NVIDIA drivers from this PPA may require you to install some xorg-x11 packages but it’s not a big deal - I guess you have installed them anyways.

How does one “know how to install them properly”? I read the README and the shell script installer - is that enough?

It would be great if someone from NVIDIA could chime in here…

NVIDIA drivers are not meant to be installed manually by inexperienced users. /Thread

Artem, you may be “experienced”, but are an unhelpful ass. Good day sir.

I’ll take this up on other channels.

Hi WillDennis,
I see the nouveau driver is loaded on your system. Please blacklist it:

You can add the nouveau driver to the /etc/modprobe.d/blacklist.conf file, or create a file such as /etc/modprobe.d/disable-nouveau.conf with the entries below:
blacklist nouveau
options nouveau modeset=0

  • Then set the kernel parameters: vga=0 rdblacklist=nouveau nouveau.modeset=0
  • Reboot
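After the reboot, one way to confirm that the blacklist took effect is to look for nouveau in /proc/modules. The snippet below is only a minimal Python sketch: the helper name is_module_loaded is made up for illustration, and it assumes the usual /proc/modules layout in which the module name is the first field on each line.

```python
from pathlib import Path


def is_module_loaded(name, modules_text):
    """Return True if `name` is listed as a module in /proc/modules-style text.

    Hypothetical helper: only the first whitespace-separated field of each
    line (the module name) is compared, so a module that merely appears in
    another module's dependency list does not count as loaded.
    """
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        text = proc_modules.read_text()
        print("nouveau loaded:", is_module_loaded("nouveau", text))
```

If this still reports nouveau as loaded after the reboot, the blacklist file or the kernel parameters were probably not picked up.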

The latest Long Lived Branch version is 375.66 and the latest Short Lived Branch version is 381.22; you can get both from http://www.nvidia.com/object/unix.html. You can also test the current beta release: https://devtalk.nvidia.com/default/topic/1016125

Make sure no application is using the nvidia module. To do that, you can uninstall the earlier driver with nvidia-uninstall, reboot the system, and then try a fresh driver installation.
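To see whether anything still holds the nvidia module, one option is to read its use count from /proc/modules. This is a rough Python sketch, not an official check: module_use_count is a hypothetical helper, and it assumes the third field of each /proc/modules line is the use count. A non-zero count can also come from dependent modules rather than from applications.

```python
from pathlib import Path


def module_use_count(name, modules_text):
    """Return the use count of `name` from /proc/modules-style text.

    Hypothetical helper. Returns None when the module is not listed.
    Assumes the line layout "name size use_count dependencies state address".
    """
    for line in modules_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == name:
            return int(fields[2])
    return None


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        count = module_use_count("nvidia", proc_modules.read_text())
        print("nvidia use count:", count)
```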

If you don’t need any GUI/X or OpenGL libraries on your system, you can install the driver with --no-opengl-files.

I think you are running kernel 4.4.0-77-generic. Make sure the matching linux-headers-4.4.0-77-generic and linux-headers-4.4.0-77 packages are installed.
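The header check can be sketched in Python as follows. This is an illustration under assumptions: expected_header_packages is a made-up helper that derives the two package names from the running kernel release, and the script assumes Ubuntu unpacks kernel headers under /usr/src/<package name>.

```python
import platform
from pathlib import Path


def expected_header_packages(release):
    """Return the header package names that should match a kernel release.

    Hypothetical helper: for "4.4.0-77-generic" it yields the flavour-specific
    package plus the base package with the trailing flavour removed.
    """
    packages = [f"linux-headers-{release}"]
    base, sep, flavour = release.rpartition("-")
    # Only strip the suffix when it looks like a flavour name such as "generic".
    if sep and flavour.isalpha():
        packages.append(f"linux-headers-{base}")
    return packages


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints
    for package in expected_header_packages(release):
        # Assumption: headers live in /usr/src/<package name> on Ubuntu.
        present = (Path("/usr/src") / package).is_dir()
        print(f"{package}: {'found' if present else 'missing'}")
```

A "missing" line would suggest the headers for the running kernel are not installed, which would prevent the installer from building nvidia.ko.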

What is the grub2 alternative to “vga=0”? vga= is deprecated in grub2.

Edit: maybe gfxpayload=text?

https://wiki.archlinux.org/index.php/GRUB/Tips_and_tricks#Setting_the_framebuffer_resolution
https://www.gnu.org/software/grub/manual/html_node/gfxpayload.html

Thank you, sandipt.