Ubuntu 16.04 CUDA 8+driver install problems with mixed sources

I had a working .deb PM install of CUDA 8.0 on Ubuntu 16.04. Later, I added the ppa:graphics-drivers/ppa for apt-get updating of the drivers.
I upgraded the graphics driver (probably to version 378) with apt-get upgrade. This broke CUDA, nvidia-smi, etc.

I decided to do a full clean reinstall of the CUDA and NVIDIA driver stack to fix this.

The CUDA and NVIDIA driver stack was removed with "apt-get autoremove --purge cuda* nvidia-*". I then manually removed all remaining config files and directories.
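To check that nothing was left behind, something along these lines should come back (nearly) empty afterwards:

dpkg -l | grep -i -E 'nvidia|cuda'     # any packages still installed from either stack
ls -d /usr/local/cuda* 2>/dev/null     # leftover toolkit directories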

I redid the CUDA .deb install as described in the Installation Guide for Linux (CUDA Toolkit Documentation), but I am still getting the same errors.
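For reference, the .deb steps from the guide are essentially the following (the repo .deb filename here is a placeholder; use whatever file was actually downloaded):

sudo dpkg -i cuda-repo-ubuntu1604_<version>_amd64.deb   # register the local repo
sudo apt-get update
sudo apt-get install cuda                               # pulls in the toolkit plus the bundled driver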
The currently loaded driver, as reported by cat /proc/driver/nvidia/version, is:

NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.57 Mon Oct 3 20:37:01 PDT 2016
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)

Errors:
nvidia-smi: Failed to initialize NVML: Driver/library version mismatch
deviceQuery: FAILS
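If I understand the mismatch error correctly, it means the kernel module that is loaded (367.57 above) does not match the user-space NVML library that nvidia-smi loads. Roughly the following should confirm it (library paths depend on how the driver was packaged):

cat /proc/driver/nvidia/version                     # version of the loaded kernel module
dpkg -l | grep -i nvidia                            # versions of the installed driver packages
ls /usr/lib/nvidia-*/libnvidia-ml.so.* 2>/dev/null  # user-space NVML library the tools link against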

My questions are the following:

  • Are there incompatibilities between the graphics-drivers ppa repo and CUDA .deb install?
  • NVIDIA does not mention a required or even preferred type of driver install in the install docs:
    What is the most stable way to install? .run-file graphics driver + .run-file CUDA? Or both via the package manager .deb method?
  • Is there a way I can use the graphics-drivers/ppa with CUDA or is this ill-advised?
  • Can I fix this mess without rebooting the server?

My best guess is to remove the graphics-drivers/ppa, purge the whole stack again, and do a full reinstall with the .deb files.
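If I go that route, removing the PPA first would presumably look something like this (ppa-purge is a separate package that also downgrades anything the PPA replaced):

sudo apt-get install ppa-purge
sudo ppa-purge ppa:graphics-drivers/ppa                    # downgrade/remove packages that came from the PPA
# or just drop the source without touching installed packages:
sudo add-apt-repository --remove ppa:graphics-drivers/ppa
sudo apt-get update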

Are there incompatibilities between the graphics-drivers PPA and the CUDA .deb install? There can be, yes. Those driver sources are not maintained by the same group, and NVIDIA drivers can be packaged in a variety of ways, for a variety of purposes.

As for a required or preferred install method: there is no preferred method. Both methods have strengths and weaknesses, and both serve particular purposes. The key is not to mix the runfile install method with the package manager install method; in my opinion the guide makes that fairly clear.
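A quick way to tell which method a machine is currently carrying, before cleaning up, is roughly:

ls /usr/bin/nvidia-uninstall 2>/dev/null   # a runfile driver install leaves its own uninstaller behind
dpkg -l | grep -i -E 'nvidia|cuda'         # a package manager install shows up in dpkg instead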

As for using the graphics-drivers PPA together with CUDA: it might work, it might not; YMMV. Since NVIDIA does not control those packages, anything is possible, and an answer that applies to a particular driver may not apply to another, and an answer that applies today may not apply tomorrow. Given that uncertainty, the strong recommendation, at least for CUDA usage, is to install via the instructions in the Linux Installation Guide, using binaries provided by NVIDIA, and that rules out packages from other sources.
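If the PPA is kept enabled anyway, one partial mitigation (no guarantee) is to hold the driver packages so a routine apt-get upgrade cannot silently replace them; the package name below is just an example for the 367 branch:

sudo apt-mark hold nvidia-367   # use whatever driver branch package is actually installed
apt-mark showhold               # confirm the hold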

That would be my best guess too. Purge all NVIDIA GPU related packages, regardless of source, and start over, strictly following the method given in the CUDA Linux Installation Guide and using binaries either from www.nvidia.com or from the CUDA Toolkit downloads page on NVIDIA Developer.

Well, I started doing a full .run-file based reinstall, but the catch is that there does not seem to be a way to install the driver without rebooting. That is a problem because we cannot reboot our R&D server right now. But that is another issue.
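From what I have read, when only the kernel module is stale it can sometimes be reloaded without a full reboot, by unloading the old modules so the newly installed ones are picked up on next use. A sketch, assuming no display manager or CUDA job is holding the GPU:

lsmod | grep nvidia                                       # see which nvidia modules are loaded
sudo systemctl stop lightdm                               # stop the display manager first, if one is running
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia    # unload whichever of these are present, in this order
sudo nvidia-smi                                           # triggers loading of the freshly installed module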