Unable to install newer version of CUDA (Failed to initialize NVML: Driver/library version mismatch)

Hi,

I am a newbie here, so please bear with me. I am trying to upgrade my CUDA version, but when I install the newest version, the driver that gets installed (495) doesn’t seem to be compatible with my computer. The display gets garbled and nvidia-smi gives this error: Failed to initialize NVML: Driver/library version mismatch. Rebooting does not help; the problem is still there.

The recommended driver for my computer is 470, so I uninstalled 495 and installed 470. nvidia-smi now shows:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce …  On     | 00000000:03:00.0 N/A |                  N/A |
| 21%   45C    P5    N/A /  N/A |    213MiB /   980MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

If I do a simple sudo apt install nvidia-cuda-toolkit, I end up with CUDA 9.1. I would like a newer version, and I believe my GPU is still supported. I have a GeForce GTX 650 and am running Ubuntu 18.04.

My ultimate goal is to get PyTorch to use the GPU, which it currently can’t do with CUDA 9.1.

Any help would be super appreciated! I’m quite lost here.

Hi,

Yes, there are a few things going on here.

Your GTX 650 is a Kepler card with compute capability 3.0, which is not supported by drivers above 470.X.
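To see which driver branch is actually loaded, a minimal sketch along these lines may help. The "Driver/library version mismatch" error generally means the loaded kernel module and the installed user-space libraries come from different driver versions, so the kernel side is worth checking first. The function names kernel_module_version and driver_ok_for_kepler are made up for this example, and the 470 cutoff is taken from the statement above.

```shell
# Extract the loaded NVIDIA kernel module version from text in the format of
# /proc/driver/nvidia/version (read from stdin).
kernel_module_version() {
    grep -oE 'Kernel Module +[0-9]+(\.[0-9]+)+' | grep -oE '[0-9]+(\.[0-9]+)+'
}

# Succeed if the given driver version is not newer than the 470 branch
# (hypothetical helper; the cutoff comes from the reply above).
driver_ok_for_kepler() {
    [ "${1%%.*}" -le 470 ]
}

# Only query the system when an NVIDIA kernel module is loaded.
if [ -r /proc/driver/nvidia/version ]; then
    loaded=$(kernel_module_version < /proc/driver/nvidia/version)
    echo "loaded kernel module: $loaded"
    if driver_ok_for_kepler "$loaded"; then
        echo "driver branch is usable with a Kepler card"
    else
        echo "driver branch is newer than 470: not usable with a Kepler card"
    fi
else
    echo "no NVIDIA kernel module is loaded"
fi
```

If the version printed here differs from the "Driver Version" that nvidia-smi would report, the two halves of the driver are out of sync.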

CUDA 11.4 does not support compute capability 3.0; the last version that does is CUDA 10.2.
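A quick way to check the installed toolkit against that limit is to read the release number from nvcc. This is only a sketch: the helper names nvcc_release and toolkit_supports_sm30 are invented for illustration, the 10.2 cutoff is the one stated above, and the version is assumed to have the form major.minor.

```shell
# Extract "major.minor" from `nvcc --version` output read on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed if a toolkit version (major.minor) is 10.2 or older, i.e. not past
# the last release said above to support compute capability 3.0.
toolkit_supports_sm30() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -lt 10 ] || { [ "$major" -eq 10 ] && [ "$minor" -le 2 ]; }
}

# Only query nvcc when a toolkit is on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    release=$(nvcc --version | nvcc_release)
    if toolkit_supports_sm30 "$release"; then
        echo "CUDA $release can still target compute capability 3.0"
    else
        echo "CUDA $release is too new for compute capability 3.0"
    fi
else
    echo "nvcc not found on PATH"
fi
```

Note that the "CUDA Version" shown by nvidia-smi is the highest version the driver supports, not the toolkit that is installed, so nvcc is the thing to ask.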

There’s a good chance that the current version of PyTorch no longer supports compute capability 3.0 cards. I had a brief look, but could not find an obvious reference. If you aren’t building it yourself, then a release package built with CUDA 10.2 may be a place to start.
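Whichever PyTorch package ends up installed, something like the following can show what it was built against and whether it sees the card. It is a rough sketch, not an official procedure: report_torch_cuda is a name made up here, and it only uses torch.version.cuda, torch.cuda.is_available() and torch.cuda.get_device_capability(). It does not tell you whether the build contains kernels for compute capability 3.0.

```shell
# Print the CUDA version a PyTorch build was compiled against and whether
# it can see a GPU. Prints a short message if python3 or torch is missing.
report_torch_cuda() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found"
        return 0
    fi
    python3 - <<'EOF'
try:
    import torch
except Exception as exc:
    print("PyTorch could not be imported:", exc)
else:
    print("torch version:     ", torch.__version__)
    print("built against CUDA:", torch.version.cuda)  # None for CPU-only builds
    available = torch.cuda.is_available()
    print("CUDA available:    ", available)
    if available:
        # (major, minor) compute capability of device 0
        print("device capability: ", torch.cuda.get_device_capability(0))
EOF
}

report_torch_cuda
```

If "CUDA available" is True but real work on the GPU fails with a "no kernel image" style error, the package was probably built without support for this card's architecture.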

I have no experience with PyTorch.

Thank you so much for your reply!!

I went ahead and installed CUDA 10.2 following the commands NVIDIA gives for my system (using the local installer). However, CUDA does not get updated, and I can see that the installation removes my driver and installs version 495, which is not compatible with my computer.

Any idea why this is happening, or what I should do to install 10.2 correctly?

I finally figured out that I need to use the .run (runfile) installer. Using that and other instructions online, I managed to install CUDA 10.2 and keep the driver version I wanted.
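For anyone landing here later, a toolkit-only runfile install might look roughly like this. The exact options used are not stated above, so treat this as an assumption: it relies on the runfile installer’s --silent and --toolkit options, the wrapper install_toolkit_only is a made-up name, and the runfile name in the comment is the usual one for CUDA 10.2 but should be replaced with whatever file was actually downloaded.

```shell
# Install only the CUDA toolkit from a runfile, leaving the existing driver
# alone. --toolkit selects just the toolkit component and --silent skips the
# interactive menu (which also means the EULA is accepted implicitly).
install_toolkit_only() {
    runfile=$1
    if [ ! -f "$runfile" ]; then
        echo "runfile not found: $runfile" >&2
        return 1
    fi
    sudo sh "$runfile" --silent --toolkit
}

# Example call (assumed file name, adjust as needed):
# install_toolkit_only ./cuda_10.2.89_440.33.01_linux.run

# After the installer finishes, make the toolkit visible to the shell.
# /usr/local/cuda-10.2 is the default install location.
export PATH=/usr/local/cuda-10.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```

Afterwards, nvcc --version should report release 10.2 and nvidia-smi should still show the 470 driver.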