[Debian] cudaErrorInsufficientDriver with 390.77 on GTX 760 with CUDA 9.1

christian.haschek · August 6, 2018, 8:34pm

I’m having trouble getting CUDA 9.1 to run on a GTX 760 with the 390.77 driver. This is a headless Debian 9 system (freshly installed for this project). The driver installed without problems from the official Debian backports repository.

Using the same procedure I was able to get a GTX 970 running, but the GTX 760 doesn’t seem to like me.

When I run nvidia-smi I get the expected output

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.77                 Driver Version: 390.77                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 760     Off  | 00000000:01:00.0 N/A |                  N/A |
|  0%   41C    P0    N/A /  N/A |      0MiB /  1999MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

I can compile every example program, but they all fail at runtime with the following error:

CUDA error at ../../common/inc/helper_cuda.h:1160 code=35(cudaErrorInsufficientDriver) "cudaGetDeviceCount(&device_count)"

Am I missing something?

Robert_Crovella · August 6, 2018, 8:37pm

That may be the issue. Distro-maintained repos sometimes do not include all the driver pieces needed to support CUDA development. I usually recommend installation of drivers from official NVIDIA sources.

christian.haschek · August 7, 2018, 8:10am

Thanks for your reply!

I reinstalled the whole system (just to have a clean start) and installed the official drivers.

Sadly, the exact same error occurs. I can compile all applications (including the CUDA samples), but at runtime I get the following error:

CUDA Error: CUDA driver version is insufficient for CUDA runtime version

or

root@violet:~/NVIDIA_CUDA-9.1_Samples/bin/x86_64/linux/release# ./clock 
CUDA Clock sample
CUDA error at ../../common/inc/helper_cuda.h:1160 code=35(cudaErrorInsufficientDriver) "cudaGetDeviceCount(&device_count)"
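Error 35 (cudaErrorInsufficientDriver) is what the runtime reports when the CUDA version supported by the driver library it loaded is older than the runtime itself. One way to look at both numbers directly is sketched below. It is only a sketch under assumptions: `libcuda.so.1` is assumed to be the user-space driver library the loader finds, `libcudart.so.9.1` is an assumed name for the CUDA 9.1 runtime library (it may need a full path), and the helper names `decode_cuda_version`, `query_version` and `describe` are made up for illustration.

```python
import ctypes


def decode_cuda_version(value):
    """Split CUDA's integer encoding (1000 * major + 10 * minor) into (major, minor)."""
    return value // 1000, (value % 1000) // 10


def query_version(library_name, function_name):
    """Call an `int f(int *)` style version query; return None if it is unavailable."""
    try:
        lib = ctypes.CDLL(library_name)
        func = getattr(lib, function_name)
    except (OSError, AttributeError):
        # Library not found by the loader, or symbol missing from it.
        return None
    version = ctypes.c_int(0)
    status = func(ctypes.byref(version))
    # Both queries return 0 on success.
    return version.value if status == 0 else None


def describe(label, value):
    """Format a version integer (or None) as a readable line."""
    if value is None:
        return f"{label}: unavailable"
    major, minor = decode_cuda_version(value)
    return f"{label}: CUDA {major}.{minor}"


if __name__ == "__main__":
    # Assumed library names; adjust to whatever is actually installed.
    print(describe("driver supports", query_version("libcuda.so.1", "cuDriverGetVersion")))
    print(describe("runtime is", query_version("libcudart.so.9.1", "cudaRuntimeGetVersion")))
```

If the driver line shows "unavailable" or a lower version than the runtime line, the process is probably not picking up the user-space library that belongs to the 390.77 kernel module.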

I even tried installing Debian both with and without a GUI. Nothing changes.

nvidia-smi output:

root@violet:~# nvidia-smi 
Tue Aug  7 10:09:26 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.77                 Driver Version: 390.77                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 760     Off  | 00000000:01:00.0 N/A |                  N/A |
| 20%   35C    P8    N/A /  N/A |     23MiB /  1999MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

It seems the (pretty old) GTX 760 is not supported, either by CUDA itself or by some part of the driver that CUDA needs.

Robert_Crovella · August 7, 2018, 10:46am

What process exactly are you using to install CUDA 9.1? Is it the runfile method or the package manager method? Which installer file did you download?

christian.haschek · August 7, 2018, 11:15am

I’m using the runfile with the parameter --override

cd /tmp/
wget https://developer.nvidia.com/compute/cuda/9.1/Prod/local_installers/cuda_9.1.85_387.26_linux
chmod +x cuda_9.1.85_387.26_linux
./cuda_9.1.85_387.26_linux --override

My choices are:

accept the license
YES if asked to install on unsupported configuration
NO when asked if you want to install the CUDA drivers
YES when asked about the toolkit. Default location will be fine
YES when asked about the symlink
YES when asked about the samples. Default location is ok

The exact same procedure works for all other Nvidia cards I’ve tested (GTX 970 and GTX 1080). Only the GTX 760 doesn’t work.

Robert_Crovella · August 8, 2018, 12:10am

That process installs the 387.26 driver. How did you get 390.77?

Did you remove the nouveau driver?
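The runfile name above carries both the toolkit version and the driver version bundled with it, which is where the 387.26 figure comes from. A small sketch of that comparison follows. The helper names are hypothetical, and it assumes the `cuda_<toolkit>_<driver>_linux` naming pattern seen in this thread and that a driver at least as new as the bundled one is acceptable.

```python
import re


def parse_runfile_name(name):
    """Return (toolkit_version, bundled_driver_version) from a CUDA runfile name."""
    match = re.fullmatch(r"cuda_(\d+(?:\.\d+)*)_(\d+(?:\.\d+)*)_linux(?:\.run)?", name)
    if match is None:
        raise ValueError(f"unrecognized runfile name: {name}")
    return match.group(1), match.group(2)


def version_tuple(text):
    """Turn '390.77' into (390, 77) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


toolkit, bundled = parse_runfile_name("cuda_9.1.85_387.26_linux")
installed = "390.77"  # driver version reported by nvidia-smi in this thread
print(toolkit, bundled, version_tuple(installed) >= version_tuple(bundled))
# prints: 9.1.85 387.26 True
```

By this check the installed 390.77 driver is newer than the one the installer would have provided, so the driver version number alone does not explain the error.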

christian.haschek · August 8, 2018, 7:10am

I installed the 390.77 Linux driver by downloading the runfile from the NVIDIA drivers page. And yes, before installing I blacklisted the nouveau driver in the kernel.

I didn’t install the driver from the CUDA installer directly because I read that it doesn’t work on Debian.

Is that wrong?

Robert_Crovella · August 8, 2018, 2:14pm

What is the result of:

dmesg |grep NVRM

and

lsmod |grep nouv

christian.haschek · August 8, 2018, 5:15pm

~/ dmesg |grep NVRM
[    2.295149] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  390.77  Tue Jul 10 18:28:52 PDT 2018 (using threaded interrupts)

~/ lsmod |grep nouv

No output for the last command.
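The two checks above can also be scripted, for example when comparing several machines. This is a rough sketch that works on captured command output. The function names are made up here, and the regular expression assumes the NVRM line has the same shape as the one shown in this thread.

```python
import re


def nvrm_module_version(dmesg_text):
    """Return the version from the 'NVRM: loading NVIDIA ... Kernel Module' line, or None."""
    match = re.search(
        r"NVRM: loading NVIDIA .*?Kernel Module\s+(\d+(?:\.\d+)+)", dmesg_text
    )
    return match.group(1) if match else None


def nouveau_loaded(lsmod_text):
    """True if any lsmod line names the nouveau module in its first column."""
    for line in lsmod_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "nouveau":
            return True
    return False


dmesg_line = (
    "[    2.295149] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  390.77  "
    "Tue Jul 10 18:28:52 PDT 2018 (using threaded interrupts)"
)
print(nvrm_module_version(dmesg_line))  # prints: 390.77
print(nouveau_loaded(""))               # prints: False
```

A loaded 390.77 kernel module with no nouveau module present matches what the commands show, so the kernel side looks consistent with the nvidia-smi output.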

The drivers should be fine, as the setup process was the same for the GTX 970 and GTX 1080, which both worked like a charm. Maybe the GTX 760 is just unsupported?

Robert_Crovella · August 8, 2018, 5:43pm

The 760 is supported by CUDA 9.x
It is a Kepler GPU.

Anyway, if it were not supported, "driver version is insufficient for runtime version" is not the error message you would get.

Since Debian is not an officially supported OS for CUDA, it’s possible there is some interaction there, but that seems unlikely to me.

I assume there is no chance you have installed CUDA 9.2 at any point?

What is the output of:

nvcc --version

Otherwise I am stumped.
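Besides nvcc --version, a quick look at the install directories can show whether more than one toolkit is present. The sketch below assumes the runfile defaults, that is toolkits under /usr/local/cuda-<version> with /usr/local/cuda as a symlink to the active one. Those paths and the function names are assumptions for illustration, not something confirmed for this particular machine.

```python
import glob
import os


def find_toolkits(root="/usr/local"):
    """Return sorted toolkit directories matching <root>/cuda-*."""
    pattern = os.path.join(root, "cuda-*")
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


def active_toolkit(root="/usr/local"):
    """Resolve the <root>/cuda symlink, or return None if it is not present."""
    link = os.path.join(root, "cuda")
    return os.path.realpath(link) if os.path.exists(link) else None


if __name__ == "__main__":
    print("toolkits:", find_toolkits())
    print("active:", active_toolkit())
```

More than one entry in the toolkit list, or an active link pointing somewhere other than the 9.1 directory, would be worth mentioning in a reply.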

christian.haschek · August 8, 2018, 6:25pm

No, I never had CUDA 9.2. I freshly installed the OS for playing with the GPU.

nvcc --version outputs:

~/ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

Which OSes are officially supported for CUDA? I can try a different one.

Robert_Crovella · August 8, 2018, 6:28pm

The supported configurations are listed in the Linux installation guide. For CUDA 9.2, that is here:

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#system-requirements

You can find the equivalent documents for older releases linked from the legacy release download page.

christian.haschek · September 26, 2018, 8:12pm

I finally tested it on Ubuntu 16.04 and it works. So there was just a problem with Debian and that specific card. Thanks for your support.