CUDA device not detected; worked previously

My GPU is no longer detected by CUDA, even though it previously worked with the same hardware setup. I don’t believe a driver update caused it to stop, but I am not entirely sure.

I have tried a few older driver versions and installed CUDA Toolkit 11.7.1, along with the driver bundled with it. None of these solved the problem.

System info:
Windows 10 64-bit
NVIDIA GeForce GTX 980 Ti
Driver version 516.94
In the NVIDIA Control Panel, “CUDA - GPUs” is set to “All”; expanding it lists just “NVIDIA GeForce GTX 980 Ti”, which is checked

Z:\cuda\117\extras\demo_suite\deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 100
-> no CUDA-capable device is detected
Result = FAIL

Z:\>nvidia-smi
Tue Jan 24 06:23:24 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 516.94       Driver Version: 516.94       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ... WDDM  | 00000000:01:00.0  On |                  N/A |
| 26%   54C    P0    76W / 300W |    646MiB /  6144MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

I know the card is working for 3D, and it’s probably working for compute as well; it seems only CUDA is having an issue. I’m open to trying any suggestions. The best idea I have at the moment is to try different driver versions until one works.

I’ve solved this.

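I have no idea how it got set, but I had a bad environment variable: CUDA_VISIBLE_DEVICES=2, 3. That variable restricts CUDA to the listed device indices, and this machine has only one GPU (index 0), so indices 2 and 3 matched nothing and CUDA saw no devices at all. The correct value here would be 0. Simply removing the environment variable fixed the issue.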
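For anyone who lands here with the same symptom, here is a rough Python sketch of how CUDA_VISIBLE_DEVICES is interpreted, as far as I understand the documented behaviour: indices are read in order, and the list is cut off at the first index that does not exist. The function name visible_devices and the gpu_count argument are made up for illustration; the script does not query the driver, so you supply the number of physical GPUs yourself (1 in my case). It also only handles integer indices, not GPU UUIDs.

```python
import os


def visible_devices(env_value, gpu_count):
    """Approximate which physical GPU indices CUDA would expose.

    env_value: the CUDA_VISIBLE_DEVICES string, or None if unset.
    gpu_count: number of physical GPUs (assumed known, not queried).
    Hypothetical helper: integer indices only, UUID entries are not handled.
    """
    if env_value is None:
        # Variable unset: every physical GPU is visible.
        return list(range(gpu_count))

    visible = []
    for token in env_value.split(","):
        token = token.strip()
        # Stop at the first entry that is not a valid index;
        # anything listed after it is ignored as well.
        if not token.isdigit() or int(token) >= gpu_count:
            break
        visible.append(int(token))
    return visible


if __name__ == "__main__":
    value = os.environ.get("CUDA_VISIBLE_DEVICES")
    # gpu_count=1 is an assumption matching a single-GPU machine.
    print("CUDA_VISIBLE_DEVICES =", repr(value))
    print("Visible device indices:", visible_devices(value, gpu_count=1))
```

With the value I had ("2, 3") and one GPU, this yields an empty list, which matches deviceQuery reporting that no CUDA-capable device is detected. On Windows you can check the actual value from a command prompt with echo %CUDA_VISIBLE_DEVICES%.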
