CUDA Installation on Ubuntu 18.04 Failing

Hi Everyone,

I am attempting to install CUDA on a new Ubuntu 18.04 installation with a GeForce GTX 1080 Ti GPU.

I started the installation process by attempting to install the Nvidia 410 driver using the following steps:

$ sudo apt update
$ sudo apt upgrade
$ ubuntu-drivers devices
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt install nvidia-driver-410

After the installation completes, I reboot and run nvidia-smi to confirm that it succeeded; however, the response is:
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

I ran sudo lshw -c display; this is the output:
*-display
description: VGA compatible controller
product: GP102 [GeForce GTX 1080 Ti]
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:01:00.0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
configuration: driver=nouveau latency=0
resources: irq:136 memory:c2000000-c2ffffff memory:b0000000-bfffffff memory:c0000000-c1ffffff ioport:4000(size=128) memory:c0000-dffff
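
The `configuration: driver=nouveau` line above shows that the open-source nouveau driver, not the NVIDIA driver, is still bound to the card. A minimal sketch for pulling that field out of `lshw` output follows; the helper name `display_driver` is hypothetical and not part of any tool.

```shell
#!/bin/sh
# display_driver: hypothetical helper. Reads `lshw -c display` output on stdin
# and prints the kernel driver named on each "configuration:" line.
display_driver() {
    sed -n 's/^[[:space:]]*configuration:.*driver=\([^[:space:]]*\).*/\1/p'
}

# Usage on a real machine (root gives the most complete lshw output):
#   sudo lshw -c display | display_driver
```

For the output posted above this prints `nouveau`; once the NVIDIA kernel module owns the device, the same field should read `nvidia`.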

Finally, I ran sudo nvidia-bug-report.sh and attached the output.

Any assistance to get past this hurdle would be appreciated.

Thanks.
nvidia-bug-report.log.gz (188 KB)

This isn’t the process that NVIDIA documents.

NVIDIA doesn’t maintain the ppa repository. You’re welcome to use it if you wish, of course.

The NVIDIA process is documented at:

http://www.nvidia.com/getcuda

and

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html

If you want to follow the NVIDIA approach, I recommend reading the above Linux install guide and starting with a clean Linux install. Alternatively, you could read the section on “handle conflicting install methods” carefully, to see if it applies to your current situation.

I restarted with a fresh Linux install and followed these instructions:

https://docs.nvidia.com/cuda/archive/10.0/cuda-installation-guide-linux/index.html
(downloaded the .deb repo)

After the installation completes, I reboot and run nvidia-smi to confirm that it succeeded; however, the response is:
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Attached newest bug report.

Any ideas what is wrong?
nvidia-bug-report.log.gz (116 KB)

Which version of CUDA are you attempting to install?

Which kernel version is in your 18.04 install?

I am attempting to install v10.0, and the kernel version is 4.15.0-74-generic.

The key issue from the logs is here:

[   52.332876] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[   52.333354] NVRM: The NVIDIA probe routine was not called for 1 device(s).
[   52.333355] NVRM: This can occur when a driver such as:
               NVRM: nouveau, rivafb, nvidiafb or rivatv
               NVRM: was loaded and obtained ownership of the NVIDIA device(s).
[   52.333355] NVRM: Try unloading the conflicting kernel module (and/or
               NVRM: reconfigure your kernel without the conflicting
               NVRM: driver(s)), then try loading the NVIDIA kernel module
               NVRM: again.
[   52.333356] NVRM: No NVIDIA graphics adapter probed!
[   52.333497] nvidia-nvlink: Unregistered the Nvlink Core, major device number 237

This is puzzling, because it is the error I would expect if you were using the runfile install method and hadn’t done the nouveau removal. However, that step should not be necessary for a .deb (package manager) install.
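
For reference, the nouveau removal mentioned above is described in the Linux install guide as a modprobe blacklist file followed by an initramfs rebuild. The sketch below prints the two configuration lines for Ubuntu; the helper name `nouveau_blacklist` is hypothetical, and the commands that actually change the system are left as comments so nothing is modified by accident.

```shell
#!/bin/sh
# nouveau_blacklist: hypothetical helper that prints the modprobe configuration
# used to keep the nouveau kernel module from loading at boot.
nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0'
}

# Usage on a real machine (not executed by this sketch):
#   nouveau_blacklist | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
#   sudo update-initramfs -u    # rebuild the initramfs so the blacklist applies at boot
#   sudo reboot
#   lsmod | grep nouveau        # no output means nouveau is no longer loaded
```

If `lsmod` still lists nouveau after the reboot, the NVIDIA module will most likely keep failing with the same NVRM message.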

If there is something unusual about your linux install or distro, or you are doing something unexpected, then it’s possible that could be an issue. Otherwise I don’t have an explanation.

When people have trouble with one install method, sometimes it’s possible to get the other install method to work.

You might want to try a runfile installer. There are additional steps/directions needed in this case, including cleaning out the previous install, so read the Linux install guide carefully.

I reinstalled Linux and attempted the installation using the runfile method instead.

Once the machine rebooted, it wouldn’t launch the desktop; it just showed scrolling error messages.

I restarted, went into GRUB, and booted in text mode to run the debug tool.

Output is attached.
nvidia-bug-report.log.gz (537 KB)

Hi, I’m having the exact same issue with the exact same configuration: GTX 1080 Ti, trying to install CUDA 10.0 on a fresh Ubuntu 18.04 install.

The installation instructions are the ones you mentioned in your post.

Driver 410 is where the problem starts. I’ve tried 440, 435, and 415, and both 440 and 435 DO work and install fine. The problem is that the CUDA 10.0 .deb file installs its own 410 driver, and that’s where the trouble begins. It’s all about the 410 driver not working correctly on this GPU or this version of Ubuntu; I don’t know which. I need the specific CUDA 10.0 version to run some software that requires it.

410 also fails when installed standalone. It apparently finishes without errors, but nvidia-smi gives the error he mentioned, even after a reboot.
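
Since the 440 and 435 drivers are reported to work here, one route that may be worth trying is keeping a working driver and installing only the toolkit. The sketch below rests on two assumptions that should be checked against NVIDIA's documentation: that CUDA 10.0 needs a Linux driver of at least 410.48 (per the release notes), and that the `cuda-toolkit-10-0` meta-package from the 10.0 repository installs the toolkit without the driver, unlike the `cuda` meta-package. The names `version_ge` and `MIN_DRIVER` are hypothetical, and this has not been verified on the configuration in this thread.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A is >= B.
# Relies on the version sort (-V) provided by GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum Linux driver for CUDA 10.0 (check the CUDA release notes).
MIN_DRIVER="410.48"

# Usage on a real machine (not executed by this sketch):
#   installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   if version_ge "$installed" "$MIN_DRIVER"; then
#       sudo apt install cuda-toolkit-10-0   # toolkit only; assumed not to pull in driver 410
#   fi
```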

I’ll try installing from the local .run file, but then some other packages won’t work because they require the .deb …
Have you found any solution to this?

Thank you.
Eduardo

I got my configuration to work by performing the following steps:

One other piece of software I used was TimeShift, which allowed me to create restore points so that if I broke something along the way, I could simply roll back to the last working version and not have to do a complete reinstall. Just make sure to perform a backup after each successful step.
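
The backup-after-each-step habit can be scripted with the Timeshift command line. A minimal sketch, assuming the `timeshift` CLI is installed and already configured; the wrapper name `checkpoint` and the `DRY_RUN` switch are hypothetical additions for illustration.

```shell
#!/bin/sh
# checkpoint LABEL: hypothetical wrapper that creates a Timeshift snapshot
# tagged with LABEL. With DRY_RUN=1 it only prints the command it would run.
checkpoint() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'sudo timeshift --create --comments "%s"\n' "$1"
    else
        sudo timeshift --create --comments "$1"
    fi
}

# Usage on a real machine (not executed by this sketch):
#   checkpoint "driver installed, nvidia-smi works"
#   sudo timeshift --list       # show existing snapshots
#   sudo timeshift --restore    # roll back interactively if a later step breaks
```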

Hope this helps.