cudaGetDeviceCount error 3 (cudaErrorInitializationError)

I have installed CUDA 11.2 and I have Visual Studio 2019 Community Edition.

For reference, nvcc -V returns:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_22:08:44_Pacific_Standard_Time_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

The CUDA Wizard works fine, and I have successfully run the deviceQueryDrv sample, with the following output:

CUDA Device Query (Driver API) statically linked version
Detected 1 CUDA Capable device(s)

Device 0: “Quadro K3100M”
CUDA Driver Version: 10.1
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 4096 MBytes (4294967296 bytes)
( 4) Multiprocessors, (192) CUDA Cores/MP: 768 CUDA Cores
GPU Max Clock rate: 706 MHz (0.71 GHz)
Memory Clock rate: 1600 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 524288 bytes
Max Texture Dimension Sizes 1D=(65536) 2D=(65536, 65536) 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Texture alignment: 512 bytes
Maximum memory pitch: 2147483647 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
CUDA Device Driver Mode (TCC or WDDM): WDDM (Windows Display Driver Model)
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
Result = PASS

However, when I run the deviceQuery sample, it fails at line 61 of deviceQuery.cpp:
cudaError_t error_id = cudaGetDeviceCount(&deviceCount);

The cudaGetDeviceCount call returns error 3 (cudaErrorInitializationError), and deviceCount equals zero.

I am stuck on this error, and I would greatly appreciate your help in understanding its cause and how to solve it.

Many thanks in advance.

Your problem is the GPU's compute capability. CUDA 11.x supports GPUs with a compute capability of 3.5 or higher, while your GPU has compute capability 3.0. So you would want to look for a new(er) GPU. I would highly recommend a GPU with at least compute capability 6.x, as CUDA 11.x already warns about the deprecation of compute capabilities 3.5, 3.7, and 5.0, which makes it likely that support for these will be removed in the next major CUDA version.
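As a rough illustration of the comparison described above, here is a minimal sketch in plain C++. The names ComputeCapability, meetsMinimumCapability and kCuda11Minimum are hypothetical and not part of the CUDA API; in a real program the major and minor numbers would come from the device properties (they are the values the device query sample prints as "CUDA Capability Major/Minor version number"). The 3.5 minimum is the figure stated in this thread for CUDA 11.x.

```cpp
// Sketch only: ComputeCapability, meetsMinimumCapability and kCuda11Minimum
// are hypothetical names for illustration, not part of the CUDA API.
struct ComputeCapability {
    int major;
    int minor;
};

// Returns true when `device` is at least `required`, comparing the major
// number first and the minor number only when the majors are equal.
bool meetsMinimumCapability(ComputeCapability device, ComputeCapability required) {
    if (device.major != required.major) {
        return device.major > required.major;
    }
    return device.minor >= required.minor;
}

// Minimum for CUDA 11.x as stated in this thread: compute capability 3.5.
const ComputeCapability kCuda11Minimum = {3, 5};
```

For the Quadro K3100M (compute capability 3.0), meetsMinimumCapability({3, 0}, kCuda11Minimum) returns false, which matches the diagnosis above.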

Alternatively you should be able to install CUDA 10.1, which is what the installed CUDA driver supports. I have no personal experience with running a cc3.0 GPU with CUDA 10, as my own cc3.0 GPU is running with CUDA 9.2.

General note: One cannot run CUDA 11.x on top of a driver that supports only CUDA versions up to 10.1.
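To make the general note concrete, here is a small sketch of how the two version numbers can be compared. It assumes the integer encoding used by cudaDriverGetVersion and cudaRuntimeGetVersion (1000 * major + 10 * minor, so 11.2 is 11020); the helper names CudaVersion, decodeCudaVersion, formatCudaVersion and driverOlderThanRuntime are hypothetical. The code deliberately takes the integers as parameters so that it compiles without the CUDA toolkit.

```cpp
#include <string>

// Sketch only: CudaVersion, decodeCudaVersion, formatCudaVersion and
// driverOlderThanRuntime are hypothetical helpers, not CUDA API functions.
struct CudaVersion {
    int major;
    int minor;
};

// CUDA encodes versions as 1000 * major + 10 * minor (e.g. 11.2 -> 11020).
CudaVersion decodeCudaVersion(int encoded) {
    return {encoded / 1000, (encoded % 1000) / 10};
}

// Formats an encoded version as "major.minor".
std::string formatCudaVersion(int encoded) {
    const CudaVersion v = decodeCudaVersion(encoded);
    return std::to_string(v.major) + "." + std::to_string(v.minor);
}

// Conservative check: flags any driver whose supported CUDA version is lower
// than the runtime version. This is a simplification; minor-version
// compatibility within CUDA 11.x can relax it inside one major release.
bool driverOlderThanRuntime(int driverVersion, int runtimeVersion) {
    return driverVersion < runtimeVersion;
}
```

With the values from this thread (driver 10.1, i.e. 10010, and runtime 11.2, i.e. 11020), driverOlderThanRuntime(10010, 11020) returns true, which is the situation the note describes.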

Many thanks for your quick answer njuffa.

This sheds a lot of light on the problem for me! I am just surprised that deviceQueryDrv did not mention this (perhaps it could output some explanatory log messages).

I have been looking on the Net for further explanation, but without success so far. I am a complete newbie on the subject (although I am very interested in it). If you could point me to some Internet resources on the minimum compute capability required by each CUDA version, I would be grateful.

In the meantime I will cautiously install CUDA 9.2 (which still seems to be available).

Once again, many thanks.

deviceQueryDrv talks to the CUDA driver. Your driver supports CUDA 10.1, which supports cc3.0.

If I were you I would try CUDA 10.1 first, and only if that does not work go back to an even older CUDA version. I am saying that because there are also dependencies between CUDA and MSVC, and I don’t recall which CUDA version added support for MSVS 2019. Please consult the Windows Installation Guide for supported hardware and software environments.

Many thanks njuffa,

I looked at 9.2 but it does not recognize MSVS 2019.

I am currently struggling with the 10.1 install (I tried the network installer twice and it failed; I am now trying the local installer). 10.1 seems to accept MSVS 2019 (I still need to check this). If the local install also fails, I will revert to an earlier CUDA version, which will also require installing an earlier version of MSVC.

Thanks to you I have a clear path forward; many thanks for that.