I had a working package-manager (.deb) install of CUDA 8.0 on Ubuntu 16.04. Later, I added ppa:graphics-drivers/ppa so that I could update the drivers with apt-get.
I upgraded the graphics driver (probably to version 378) with apt-get upgrade. This broke CUDA, nvidia-smi, etc.
I decided to do a full clean install of the CUDA and NVIDIA driver stack to fix this.
I removed the CUDA and NVIDIA packages with 'apt-get autoremove --purge cuda* nvidia-*', then manually removed all remaining config files and directories.
I redid the CUDA .deb install as described in the Installation Guide http://docs.nvidia.com/cuda/cuda-installation-guide-linux/, but I am still getting the same errors.
The command 'cat /proc/driver/nvidia/version' reports the currently loaded driver as:
NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.57 Mon Oct 3 20:37:01 PDT 2016
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
nvidia-smi: Failed to initialize NVML: Driver/library version mismatch
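To see which side of the mismatch is stale, the loaded kernel module version can be compared with the version of the installed NVML library. The following is only a minimal sketch: the function names are my own, and the library search path (/usr/lib/**/libnvidia-ml.so.*) is an assumption about where the Ubuntu packages put the file, so adjust it for your system.

```python
import glob
import os
import re


def kernel_module_version(text):
    """Return the version from an 'NVRM version:' line, or None if absent."""
    match = re.search(r"NVRM version:.*?Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def library_versions(paths):
    """Collect versions from names like libnvidia-ml.so.367.57.

    The unversioned soname link (libnvidia-ml.so.1) is ignored.
    """
    versions = set()
    for path in paths:
        name = os.path.basename(path)
        match = re.search(r"libnvidia-ml\.so\.(\d+\.\d+(?:\.\d+)?)$", name)
        if match:
            versions.add(match.group(1))
    return versions


def main():
    # Version of the kernel module that is currently loaded.
    try:
        with open("/proc/driver/nvidia/version") as handle:
            loaded = kernel_module_version(handle.read())
    except OSError:
        loaded = None

    # Assumed install location of the user-space NVML library.
    candidates = glob.glob("/usr/lib/**/libnvidia-ml.so.*", recursive=True)
    installed = library_versions(candidates)

    print("loaded kernel module:", loaded)
    print("installed NVML libraries:", sorted(installed))
    if loaded is not None and installed and loaded not in installed:
        print("versions differ: kernel module and user-space library do not match")


if __name__ == "__main__":
    main()
```

If the two versions differ, that would be consistent with the "Driver/library version mismatch" message above, although the script only reads version strings and does not query the driver itself.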
My questions are the following:
- Are there incompatibilities between the graphics-drivers ppa repo and CUDA .deb install?
- NVIDIA's install docs do not state a required or even a preferred method for installing the driver.
What is the most stable way to install: .run-file graphics driver + .run-file CUDA, or both through the package manager (.deb)?
- Is there a way to use the graphics-drivers/ppa together with CUDA, or is this ill-advised?
- Can I fix this mess without rebooting the server?
My best guess is to remove the graphics-drivers/ppa, purge the whole stack again, and then do a full reinstall from the .deb files.