I can’t get CUDA to work on Ubuntu 22.04. My GPU is a GeForce GT 710.
My system is a fresh install. Here’s a minimal example (taken from here):
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

int main() {
    int deviceCount;
    cudaError_t error_id = cudaGetDeviceCount(&deviceCount);
    if (error_id != cudaSuccess) {
        printf("cudaGetDeviceCount returned %d: %s\n", (int)error_id, cudaGetErrorString(error_id));
        exit(EXIT_FAILURE);
    }
}
Compiling and running it with the following commands gives this error:
$ g++ -o min -I/usr/local/cuda/include min.cpp -L/usr/local/cuda/lib64 -lcudart
$ ./min
cudaGetDeviceCount returned 35: CUDA driver version is insufficient for CUDA runtime version
Any ideas on how to fix this?
Here are some things I tried:
$ nvidia-smi
Thu Mar 14 11:45:03 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.239.06 Driver Version: 470.239.06 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 N/A | N/A |
| 50% 48C P0 N/A / N/A | 488MiB / 1999MiB | N/A Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
This was with the 470 driver from the repository.
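To make the mismatch behind error 35 concrete, here is a minimal sketch of how the two version numbers compare. It assumes the usual CUDA encoding of versions as 1000 * major + 10 * minor, which is the form that cudaDriverGetVersion and cudaRuntimeGetVersion report. The helper names (decodeCudaVersion, formatCudaVersion, driverSupportsRuntime) are hypothetical and only for illustration, and the "driver must be at least as new as the runtime" rule is a simplification that ignores CUDA minor-version compatibility.

```cpp
#include <string>

// CUDA encodes versions as 1000 * major + 10 * minor, e.g. 11040 for 11.4.
// On a real system these integers would come from cudaDriverGetVersion()
// and cudaRuntimeGetVersion(); here they are passed in as plain ints.
struct CudaVersion {
    int majorPart;
    int minorPart;
};

// Split an encoded version such as 11040 into its major and minor parts.
CudaVersion decodeCudaVersion(int encoded) {
    return CudaVersion{encoded / 1000, (encoded % 1000) / 10};
}

// Render an encoded version as "major.minor", e.g. 11040 becomes "11.4".
std::string formatCudaVersion(int encoded) {
    const CudaVersion v = decodeCudaVersion(encoded);
    return std::to_string(v.majorPart) + "." + std::to_string(v.minorPart);
}

// Simplified rule (an assumption, not the full compatibility matrix):
// the runtime initialises only if the driver supports at least its version.
bool driverSupportsRuntime(int driverVersion, int runtimeVersion) {
    return driverVersion >= runtimeVersion;
}
```

With the 470 driver, nvidia-smi reports CUDA Version 11.4, so the driver side would be 11040. If the toolkit under /usr/local/cuda were, say, 12.4 (an assumed example value, not something I have verified on this machine), driverSupportsRuntime(11040, 12040) would return false, which matches the "driver version is insufficient" message.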
I then tried the driver from NVIDIA directly. I executed the following:
sudo apt install nvidia-cuda-toolkit
sudo apt install nvidia-utils-470
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda
At this point I rebooted, and during boot the following error came up:
[ 7.001282] nvidia: loading out-of-tree module taints kernel.
[ 7.001288] nvidia: module license 'NVIDIA' taints kernel.
[ 7.001289] Disabling lock debugging due to kernel taint
[ 7.001292] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 7.001292] nvidia: module license taints kernel.
[ 7.110035] nvidia-nvlink: Nvlink Core is being initialized, major device number 509
[ 7.110041] NVRM: The NVIDIA GeForce GT 710 GPU installed in this system is
NVRM: supported through the NVIDIA 470.xx Legacy drivers. Please
NVRM: visit http://www.nvidia.com/object/unix.html for more
NVRM: information. The 550.54.14 NVIDIA driver will ignore
NVRM: this GPU. Continuing probe...
[ 7.111370] NVRM: No NVIDIA GPU found.
[ 7.111707] nvidia-nvlink: Unregistered Nvlink Core, major device number 509
It seems that driver 550 was somehow installed instead of 470, which is the driver branch that supports my GPU. I reinstalled 470 with:
sudo apt install nvidia-driver-470
And now I get the same error.
NVIDIA Accelerated Graphics Driver for Linux-x86_64 (470.182.03)
ERROR: An NVIDIA kernel module 'nvidia-drm' appears to already be loaded in
your kernel. This may be because it is in use (for example, by an X
server, a CUDA program, or the NVIDIA Persistence Daemon), but this
may also happen if your kernel was configured without support for
module unloading. Please be sure to exit any programs that may be
using the GPU(s) before attempting to upgrade your driver. If no
GPU-based programs are running, you know that your kernel supports
module unloading, and you still receive this message, then an error
may have occurred that has corrupted an NVIDIA kernel module's usage
count, for which the simplest remedy is to reboot your computer.
OK
NVIDIA Accelerated Graphics Driver for Linux-x86_64 (470.182.03)
ERROR: Installation has failed. Please see the file
'/var/log/nvidia-installer.log' for details. You may find
suggestions on fixing installation problems in the README available
on the Linux driver download page at www.nvidia.com.
OK
NVIDIA Accelerated Graphics Driver for Linux-x86_64 (470.182.03)
ERROR: You appear to be running an X server; please exit X before
installing. For further details, please see the section INSTALLING
THE NVIDIA DRIVER in the README available on the Linux driver
download page at www.nvidia.com.
OK
Basically, the NVIDIA driver didn’t work, so I reverted to the original one. I still have the above issue, though, and I don’t know whether the NVIDIA driver would solve it. Any ideas?