CUDA 10.2 on 2080Ti - no CUDA-capable device is detected

Hi,

I am trying to get a working Windows 10 installation of the “CUDA Toolkit 10.2.89” with my “RTX 2080Ti” graphics card.

I understand that, to merely consume CUDA runtime services from third-party software such as PyTorch, I would only need the Windows 10 NVIDIA driver, since PyTorch ships its own CUDA runtime.

Unfortunately, that did not work with PyTorch, even though I have a PyTorch build for CUDA 10.2 (cu102) installed:
python -c "from torch.utils.cpp_extension import CUDA_HOME; print(CUDA_HOME)" → No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2'
python -c "import torch; print(torch.__version__)" → 1.10.0+cu102
python -c "import torch; print(torch.__path__)" → ['C:\Users\Roman\miniconda3\envs\TT\lib\site-packages\torch']
python -c "import torch; print(torch.version.cuda)" → 10.2
python -c "import torch; print(torch.cuda.is_available())" → False (expected True)
python -c "import torch; print(torch.cuda.device_count())" → 0 (expected 1)
python -c "import torch; print(torch.cuda.get_device_name(0))" → RuntimeError: No CUDA GPUs are available (expected “NVidia 2080Ti”)
python -c "import torch; torch.cuda.set_device(0)" → RuntimeError: No CUDA GPUs are available
python -c "import torch; print(torch.cuda.current_device())" → RuntimeError: No CUDA GPUs are available (expected 0)
python -c "import torch; print(torch.zeros(1).cuda())" → RuntimeError: No CUDA GPUs are available

So PyTorch cannot find my CUDA GPU and fails with the error “No CUDA GPUs are available”.
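
For anyone who wants to repeat these checks in one go, here is a minimal sketch that collects the same PyTorch values into a single report. The function name `torch_cuda_report` is made up for illustration; the `torch` attributes are the same ones used in the one-liners above. It only queries device names when at least one device is visible, so it does not raise the RuntimeError shown above.

```python
import importlib


def torch_cuda_report():
    """Collect the PyTorch CUDA facts from the one-liners above into a dict."""
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        # torch is missing, or one of its DLLs could not be loaded
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    # Asking for a device name with zero devices raises, so loop over the count.
    report["devices"] = [
        torch.cuda.get_device_name(i) for i in range(report["device_count"])
    ]
    return report


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If `built_for_cuda` shows a version but `device_count` is 0, the wheel itself was built with CUDA support and the failure is further down, at the driver level.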

I have tried this with many different NVIDIA drivers, all with the same result. Currently I use the latest NVIDIA driver, 546.29.

So I then tried to focus on the CUDA Toolkit only and, following the Wikipedia article on CUDA, installed the CUDA Toolkit 10.2.89. This was a 3-part installation of “cuda_10.2.89_win10_network.exe”, “cuda_10.2.1_win10.exe” and “cuda_10.2.2_win10.exe”.
I also installed the recommended “cuDNN 8.7.0 for CUDA 10.2” according to the Support Matrix in the NVIDIA Docs, as “cudnn-windows-x86_64-8.7.0.84_cuda10-archive.zip”, and extracted the “bin”, “include” and “lib\x64” files into the corresponding “CUDA\v10.2” subfolders “bin”, “include” and “lib\x64”.

The RTX 2080Ti is a “Turing” card with a Compute Capability of “7.5” and should have support for CUDA Toolkit SDK 10.0 - 10.2.

According to the CUDA 12.3 Update 1 Release Notes, the minimum “Windows x86_64 Driver Version” for CUDA Toolkit “CUDA 10.2.89” is 441.22.
I have installed the latest desktop Windows 10 NVIDIA driver, 546.29, which also appears in the output of “nvidia-smi.exe”:
C:\>nvidia-smi.exe
Wed Dec 13 20:32:02 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 546.29                 Driver Version: 546.29       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2080 Ti   WDDM  | 00000000:0E:00.0  On |                  N/A |
|  0%   47C    P8              26W / 300W |   2280MiB / 11264MiB |      0%      Default |
|                                         |                      |                  N/A |
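
The driver-version requirement quoted from the release notes can be checked mechanically. The sketch below is only an illustration: the helper names (`version_tuple`, `meets_minimum`, `query_driver_version`) are hypothetical, and the 441.22 minimum is the figure quoted above. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu=driver_version --format=csv,noheader` query mode.

```python
import shutil
import subprocess

# Minimum Windows driver for CUDA 10.2.89, as quoted above from the release notes.
MIN_DRIVER_FOR_CUDA_10_2 = "441.22"


def version_tuple(text):
    """Turn '546.29' into (546, 29) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum=MIN_DRIVER_FOR_CUDA_10_2):
    """True if the installed driver version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def query_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=False, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return lines[0].strip()  # first GPU only


if __name__ == "__main__":
    installed = query_driver_version()
    if installed is None:
        print("nvidia-smi not found or returned no driver version")
    else:
        print(f"driver {installed}, meets minimum: {meets_minimum(installed)}")
```

Both 546.29 and 536.23 pass this check, so the driver version by itself does not explain the failure.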

The installed CUDA Toolkit is reported as V10.2.89:
C:\>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:32:27_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.2, V10.2.89

When I try to run “deviceQuery.exe” I also get a “no CUDA-capable device is detected” error:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\extras\demo_suite>deviceQuery.exe
deviceQuery.exe Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 100
-> no CUDA-capable device is detected
Result = FAIL
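
Error 100 is the “no device” code, and it has the same numeric value in the driver API (CUDA_ERROR_NO_DEVICE). To take both PyTorch and the toolkit out of the picture, the driver library can be called directly. The following is a minimal sketch using ctypes; `probe_cuda_driver` is a hypothetical name, while `cuDriverGetVersion`, `cuInit` and `cuDeviceGetCount` are the standard driver API entry points exported by nvcuda.dll (libcuda.so.1 on Linux).

```python
import ctypes
import sys

CUDA_SUCCESS = 0
CUDA_ERROR_NO_DEVICE = 100  # same numeric value deviceQuery printed above


def probe_cuda_driver():
    """Call the CUDA driver API directly, bypassing the runtime and PyTorch."""
    on_windows = sys.platform == "win32"
    lib_name = "nvcuda.dll" if on_windows else "libcuda.so.1"
    result = {"library": lib_name, "loaded": False}
    try:
        # ctypes.WinDLL only exists on Windows, so it is referenced lazily.
        driver = ctypes.WinDLL(lib_name) if on_windows else ctypes.CDLL(lib_name)
    except OSError as exc:
        result["error"] = str(exc)
        return result
    result["loaded"] = True

    # cuDriverGetVersion may be called before cuInit; 12030 would mean CUDA 12.3.
    version = ctypes.c_int(0)
    result["cuDriverGetVersion"] = driver.cuDriverGetVersion(ctypes.byref(version))
    result["driver_api_version"] = version.value

    # cuInit(0) must succeed before any device query is meaningful.
    result["cuInit"] = driver.cuInit(0)
    if result["cuInit"] == CUDA_SUCCESS:
        count = ctypes.c_int(0)
        result["cuDeviceGetCount"] = driver.cuDeviceGetCount(ctypes.byref(count))
        result["device_count"] = count.value
    return result


if __name__ == "__main__":
    for key, value in probe_cuda_driver().items():
        print(f"{key}: {value}")
```

If `cuInit` already returns 100 here, the problem sits between nvcuda.dll and the kernel-mode driver, and no toolkit or cuDNN reinstall is likely to change it.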

I also checked within “NVIDIA Control Panel” → “Manage 3D Settings” → “Global Settings” that the setting “CUDA - GPUs” is set to “All” and the checkbox for “NVIDIA GeForce RTX 2080 Ti” is checked.
I only have this ONE graphics card in this PC, and there is no iGPU in the processor, which is an “AMD Ryzen 7 3700X 8-Core Processor”.
Windows 10 Pro 22H2 19045.3803, Windows Feature Experience Pack 1000.19053.1000.0

Environment variables (listing only those relevant to this problem):
C:\>SET
CommonProgramFiles=C:\Program Files\Common Files
CommonProgramFiles(x86)=C:\Program Files (x86)\Common Files
CommonProgramW6432=C:\Program Files\Common Files
CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
CUDA_PATH_V10_2=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
CUDNN_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2
DriverData=C:\Windows\System32\Drivers\DriverData
HOMEDRIVE=C:
HOMEPATH=\Users\Roman
LD_LIBRARY_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64
LOCALAPPDATA=C:\Users\Roman\AppData\Local
NUMBER_OF_PROCESSORS=16
NVCUDASAMPLES10_2_ROOT=C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.2
NVCUDASAMPLES_ROOT=C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.2
NVTOOLSEXT_PATH=C:\Program Files\NVIDIA Corporation\NvToolsExt
OS=Windows_NT
Path=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\libnvvp;c:\windows\system32;c:\windows;c:\windows\system32\wbem;c:\windows\system32\windowspowershell\v1.0;c:\windows\system32\openssh;c:\windows\system32;c:\program files\dotnet;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\Git\cmd;C:\Program Files\NVIDIA Corporation\Nsight Compute 2019.5.0;C:\Users\Roman\AppData\Local\Programs\Python\Python39\Scripts;C:\Users\Roman\AppData\Local\Programs\Python\Python39;C:\Users\Roman\AppData\Local\Microsoft\WindowsApps;C:\Users\Roman\AppData\Local\Programs\Microsoft VS Code\bin
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC
PROCESSOR_ARCHITECTURE=AMD64
PROCESSOR_IDENTIFIER=AMD64 Family 23 Model 113 Stepping 0, AuthenticAMD
PROCESSOR_LEVEL=23
PROCESSOR_REVISION=7100
ProgramData=C:\ProgramData
ProgramFiles=C:\Program Files
ProgramFiles(x86)=C:\Program Files (x86)
ProgramW6432=C:\Program Files
SystemDrive=C:
SystemRoot=C:\WINDOWS
TEMP=C:\Users\Roman\AppData\Local\Temp
TMP=C:\Users\Roman\AppData\Local\Temp
windir=C:\WINDOWS
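
One variable worth ruling out explicitly is CUDA_VISIBLE_DEVICES, since an empty or invalid value hides GPUs from CUDA. It does not appear in the list above; the sketch below just makes that check repeatable. The helper names and the name-matching rule (contains "CUDA" or "CUDNN", or starts with "NV") are illustrative assumptions, not an official list of variables the driver reads.

```python
import os


def cuda_related_env(env=None):
    """Return the environment variables whose names look CUDA/NVIDIA related."""
    env = os.environ if env is None else env
    return {
        name: value
        for name, value in env.items()
        if "CUDA" in name.upper()
        or "CUDNN" in name.upper()
        or name.upper().startswith("NV")
    }


def visibility_warnings(env=None):
    """Warn if CUDA_VISIBLE_DEVICES is set in a way that limits visible GPUs."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return []
    first = value.split(",")[0].strip()
    if first in ("", "-1"):
        # An empty list or a leading invalid index leaves no usable device.
        return [f"CUDA_VISIBLE_DEVICES={value!r} hides every GPU from CUDA"]
    return [f"CUDA_VISIBLE_DEVICES={value!r} restricts which GPUs CUDA can see"]


if __name__ == "__main__":
    for name, value in sorted(cuda_related_env().items()):
        print(f"{name}={value}")
    for warning in visibility_warnings():
        print("WARNING:", warning)
```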

I have really run out of ideas about what could be wrong with my setup.
My problem does NOT seem to be related to Python or PyTorch, since it also fails from a pure “NVIDIA Developer view”.
What else can I check?

Greetings,
Roman (from Vienna)

Basically, with a recent NVIDIA driver I should be able to run any PyTorch+cuXXX binary with my 2080Ti.
Unfortunately I still face the error “RuntimeError: No CUDA GPUs are available”, regardless of what I try.
And I am quite sure this issue does NOT arise from Python/PyTorch but from my Win10/NVIDIA setup.

I have already installed several NVIDIA drivers (clean install), always removing the previous driver with DDU first. I also checked that there were no stale contents in the registry or within the “C:\Windows\System32\DriverStore\FileRepository” directory. I also tried different CUDA Toolkits to run the “deviceQuery.exe” sample as a CUDA check, and it always fails with:

cudaGetDeviceCount returned 100
-> no CUDA-capable device is detected
Result = FAIL

My computer has no iGPU: the CPU is a Ryzen 7 3700X, which has no integrated graphics, so the 2080Ti is my only GPU!

There are no multiple "NVCUDA64.dll"s, only one in the corresponding “C:\Windows\System32\DriverStore\FileRepository\nv_dispig.inf_amd64_xxxxxxxx” subdirectory!

There are no multiple "nvcuda.dll"s, only one in the “C:\WINDOWS\system32” directory!
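
The search for duplicate driver DLLs can be scripted so it is easy to repeat after each driver reinstall. This is a rough sketch, not an NVIDIA tool: `find_driver_dlls` is a hypothetical helper, and the directories scanned at the bottom are simply the ones mentioned in this thread. The recursive scan of the DriverStore can take a while.

```python
import os
from pathlib import Path

DLL_NAMES = ("nvcuda.dll", "nvcuda64.dll")


def find_driver_dlls(directories, names=DLL_NAMES, recursive=False):
    """Return every file in the given directories whose name matches (case-insensitive)."""
    wanted = {name.lower() for name in names}
    hits = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue
        try:
            if recursive:
                candidates = (
                    Path(dirpath) / filename
                    for dirpath, _subdirs, files in os.walk(root)
                    for filename in files
                )
            else:
                candidates = (p for p in root.iterdir() if p.is_file())
            hits.extend(p for p in candidates if p.name.lower() in wanted)
        except OSError:
            # Unreadable directory: skip it rather than abort the whole scan.
            continue
    return sorted(hits)


if __name__ == "__main__":
    system_root = Path(os.environ.get("SystemRoot", r"C:\Windows"))
    flat = [system_root / "System32", system_root / "SysWOW64"]
    store = [system_root / "System32" / "DriverStore" / "FileRepository"]
    for path in find_driver_dlls(flat) + find_driver_dlls(store, recursive=True):
        print(path)
```

More than one hit under “FileRepository” would suggest leftovers from an older driver package.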

There are no EventLog messages in EventViewer!

Do you have any idea what else could cause this error?
How can I debug this problem further?
Are there any traces I could enable?

Thanks,
Roman

In the meantime I set up another clean Windows 10 Pro 22H2 for dual boot on the same machine, in an empty partition. Sure enough, Windows Update automatically installed an NVIDIA display driver via the Windows 3rd-party driver store, and everything CUDA-related worked fine. I checked it the easy way with “GPU-Z->Advanced Settings->Cuda”. The subsequent “Conda/Python/PyTorch+Cuda12.1” installation also worked as expected!

So that means my existing “old” Windows 10 22H2 has some driver/registry issue that prevents the CUDA driver library (nvcuda.dll/nvcuda64.dll) from finding my 2080Ti card as a CUDA device.

I already booted into “Safe Mode” (without Internet) and uninstalled all NVIDIA drivers with the latest version of DDU v18.0.7.0. Windows first gave the 2080Ti a Windows generic base driver. Then I rebooted back into “normal Boot” mode with Internet enabled again, and Windows update installed a default NVIDIA driver 536.23 (31.0.15.3623 dated 2023-06-08) from the windows update store (same as it did on the clean Windows Installation).

But even with this driver CUDART can NOT detect my 2080Ti.

So the problem lies within my Windows driver/registry config, which somehow still blocks CUDA from finding my GPU.

Are there any NVIDIA “cleaning tools” that clean the registry, the “C:\Windows\System32\DriverStore\FileRepository\nv*” folders, and the NVIDIA files in “C:\Windows\system32”, “C:\WINDOWS\system32\lxss\lib” and “C:\WINDOWS\SysWoW64”?

I would very much appreciate any help from NVIDIA support personnel or any other expert in this field ;-)

Thanks, Roman

I am neither : ) Nvidia do offer some advice here.