I’ve been trying to figure out if there are safe methods for installing the CUDA Toolkit, given that CUDA drivers are already installed for current hardware (1080Ti). I’ve heard that the versions need to be matched, but I’m not sure how that is confirmed in advance, or what the consequences might be.
I did immediately encounter a problem when executing the Toolkit’s .run file. Hence the caution about the driver being installed.
If I were to install via “sudo apt install nvidia-cuda-toolkit”, would there be any inherent risk of version conflicts or other problems? Would it recognize that the driver is already loaded, and leave it alone?
nvidia-cuda-toolkit is not an NVIDIA-provided or NVIDIA-managed method. It relies on packages provided by some other source.
If you want to use NVIDIA provided methods, you can use a runfile installer and deselect the option to install the driver. All CUDA runfile installers give you this option. If you use this, it will leave your installed driver alone, and it then becomes your responsibility to make sure that your installed driver is compatible with your choice of CUDA toolkits to install.
A sufficient compatibility matrix is given in the CUDA release notes:
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html (scroll down to Table 1)
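Checking an installed driver against the minimum listed in Table 1 can be sketched as a small shell helper. This is only a sketch, assuming GNU sort with -V (version sort) is available, as it is on Ubuntu. driver_at_least is a hypothetical helper name, and the minimum version still has to be read from the release notes for the toolkit you pick.

```sh
# Succeed (exit status 0) if driver version $1 is at least the minimum $2.
# Hypothetical helper; relies on GNU sort -V for version-aware ordering.
driver_at_least() {
    installed=$1
    required=$2
    # The smaller of the two versions sorts first. The driver is new
    # enough exactly when that smaller one is the required minimum.
    lowest=$(printf '%s\n%s\n' "$installed" "$required" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}
```

For example, with version numbers that come up later in this thread, driver_at_least 440.95.01 450.51.06 returns a non-zero status, meaning a 440.95.01 driver is older than a 450.51.06 minimum.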
If you install from an NVIDIA deb archive using the package manager method, the normal (Ubuntu) package manager command line would be
sudo apt-get install cuda
If you want to leave the installed driver alone and not install a driver, the command to use is
sudo apt-get install cuda-toolkit
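Before committing to either command, apt-get can simulate the install with -s (also spelled --simulate or --dry-run), which prints Inst, Remv and Conf lines without changing anything. A hedged sketch for spotting driver changes in that output follows. driver_changes is a hypothetical helper, and the package-name prefixes it looks for are an assumption about how driver packages are named in your repository, not an exhaustive list.

```sh
# Read "apt-get -s install ..." output on stdin and print the lines for
# driver-related packages that would be installed (Inst) or removed (Remv).
# The name prefixes below are an assumption, not a complete list.
driver_changes() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E '^(Inst|Remv) (nvidia-|libnvidia-|cuda-drivers)' || true
}
```

A possible use is apt-get -s install cuda-toolkit | driver_changes; empty output suggests the simulated install would not touch packages with those names.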
Installing the toolkit without a driver is also covered in the Linux install guide, and the same caveats apply.
Get your NVIDIA installers from:
Thanks, Robert. This is starting to make a bit more sense now:
I didn’t realize that “nvidia-cuda-toolkit” was not an official NVIDIA package. I’d prefer to stay with NVIDIA-issued drivers, etc.
When attempting to install the .run version of the file, I got the message:
“Existing package manager installation of the driver found. It is strongly
recommended that you remove this before continuing.”
I had attributed that to another cause, but perhaps that implies that the current driver is not official NVidia either (?) That was the very first and only message to appear. I did not see any menu options for deselecting driver installation. So that would make sense if an inherently incompatible driver was found.
So if I understand this now, there is no way to avoid uninstalling the current driver. I was rather dreading that.
Current driver is:
$ cat driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 430.26 Tue Jun 4 17:40:52 CDT 2019
GCC version: gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
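For comparing against a compatibility table, the version number can be pulled out of that NVRM line with a short sed expression. This is a sketch that assumes the line has the same layout as the output above (the text "Kernel Module" followed by the version). nvrm_version is a hypothetical helper name, and the file is typically found at /proc/driver/nvidia/version.

```sh
# Read the contents of the driver version file on stdin and print only the
# driver version from the "NVRM version:" line. Hypothetical helper; assumes
# the version number directly follows the words "Kernel Module".
nvrm_version() {
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p'
}
```

With the output shown above, nvrm_version < /proc/driver/nvidia/version would print 430.26.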
I wouldn’t be able to help you unwind whatever you’ve done historically to your Linux install.
There are instructions to clean out old drivers. If that fails, then you could also consider reloading the OS.
However, I’m not sure there is any reason not to try using the 430.26 driver you already have installed. It should work with any currently available version of CUDA toolkit.
I’m certain the runfile installer offers that option. It often takes a long time to reach the screen where the option is offered. Your description of what you did, what the exact output was, and how long you waited is insufficient for me to tell anything for certain.
Just a clarification question. I got the same error when trying to update/install cuda-toolkit 11.0.3. I currently have CUDA version 10.2 and NVIDIA driver 440.95.01, for an RTX 2070 Super. This is on a fresh install of Ubuntu 18.04. If I run the runfile installer past the error (continue, rather than abort), I have the option to install driver 450.51.06, which I think is the required one according to Table 1 at https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html. Will this tool clean out and reinstall/replace the NVIDIA drivers/CUDA, or is that something I need to do manually? I’m not too familiar with updating or removing drivers on Linux.
I strongly recommend installing and managing your packages via your Linux package management system. It makes changes a lot cleaner.
What exactly was the error message when you attempted to install nvidia-cuda-toolkit?
The error is “Existing package manager installation of the driver found. It is strongly recommended that you remove this before installing”.
I wasn’t trying to install “nvidia-cuda-toolkit”, as (if I’m not mistaken) that isn’t the official toolkit. I was trying to install the cuda-toolkit with the runfile NVIDIA supplies.
I was more interested in the error message you saw when you ran
sudo apt-get install nvidia-cuda-toolkit
Unless you’re referring to MaxV, I never ran that command.
Yes and no. Yes, I was referring to MaxV’s previous post, because he attempted to use apt-get to install the toolkit. But also no: I suggest that anyone who has issues installing drivers use apt-get if possible. The benefit of apt-get is that it handles conflicts and dependencies automatically (with or without a prompt). When it sees a package that is in conflict, it tells you what to do next: update, remove, or install additional dependencies. This avoids the confusion you ran into earlier with the .run file, which only tells you that something is wrong and not how to fix it.
Secondly, to fix the complaint from your .run file, you should run
sudo apt-get remove --purge nvidia\*
This removes the existing driver, if present, along with all dependent packages, and allows you to reinstall the toolkit/driver from a clean slate.
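Since a purge like this is hard to undo, it may help to look at what is installed first. The following is a minimal sketch, not part of the original advice: installed_nvidia_packages is a hypothetical helper that filters the output of dpkg -l down to installed packages whose names begin with "nvidia". apt-get may match more package names than this filter does, so its own simulation (apt-get -s remove --purge 'nvidia*') is the better guide to what would actually be removed.

```sh
# Read "dpkg -l" output on stdin and print the names of packages that are
# currently installed (status "ii") and whose name starts with "nvidia".
# Hypothetical helper for inspection only; it removes nothing.
installed_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /^nvidia/ { print $2 }'
}
```

A possible use is dpkg -l | installed_nvidia_packages, run before the purge so the list can be saved or compared afterwards.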
[INFO]: Driver not installed.
[INFO]: Checking compiler version…
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
[INFO]: Initializing menu
[INFO]: Setup complete
[INFO]: Components to install:
[INFO]: Executing NVIDIA-Linux-x86_64-455.23.05.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-chec$
[INFO]: Finished with code: 256
[ERROR]: Install of driver component failed.
[ERROR]: Install of 455.23.05 failed, quitting
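When a log like this is long, filtering it down to the error lines is a quick first step. The sketch below assumes the log uses the same [INFO]/[ERROR] prefixes as the excerpt above; installer_errors is a hypothetical helper name. The CUDA installer typically writes this log to /var/log/cuda-installer.log, and the driver installer it launches typically writes its own details to /var/log/nvidia-installer.log, which is usually where the reason for a failed driver component is recorded.

```sh
# Read an installer log on stdin and print only the lines tagged [ERROR].
# Hypothetical helper; -F makes grep treat "[ERROR]" as a literal string.
installer_errors() {
    # "|| true" keeps the exit status at 0 when the log has no errors.
    grep -F '[ERROR]' || true
}
```

A possible use is installer_errors < /var/log/cuda-installer.log, followed by reading the driver installer log for the underlying cause.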