A similar question for an older card that was not listed is at What's the Compute Capability of GeForce GT 330. The answer there was, essentially, to search the internet and look it up in the CUDA C Programming Guide. The newer CUDA C++ Programming Guide shipped with CUDA Toolkit v11.0.3 no longer contains that information.
According to the internet, there seem to have been multiple GPU models sold under that name: one had compute capability 2.x and the other had compute capability 3.0. Neither is supported by CUDA 11, which requires compute capability >= 3.5.
How can I find out the compute capability of my Asus GeForce GT 710? I can only hope that it is at least 3.0.
I have support for CUDA Version 11.0 according to nvidia-smi, and I was able to install CUDA Toolkit 11.0, see nvcc --version.
C:\Users\Admin>nvidia-smi
Sat Aug 15 13:17:47 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 451.82 Driver Version: 451.82 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GT 710 WDDM | 00000000:01:00.0 N/A | N/A |
| N/A 58C P0 N/A / N/A | 380MiB / 2048MiB | N/A Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
C:\Users\Admin>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:35_Pacific_Daylight_Time_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.relgpu_drvr445TC445_37.28845127_0
It does not work, though. Running a pytorch example outputs:
cuda:0
C:\Users\Admin\anaconda3\envs\ml\lib\site-packages\torch\cuda\__init__.py:125: UserWarning:
GeForce GT 710 with CUDA capability sm_35 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the GeForce GT 710 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
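The warning above boils down to a set-membership check: the installed binary ships precompiled kernel images only for the listed architectures. A minimal sketch of that check (the architecture list is copied from the warning text; the function name is hypothetical, and it ignores the PTX forward-compatibility that `compute_37` can provide):

```python
# Architectures the installed PyTorch binary ships kernel images for,
# taken verbatim from the warning message above.
SUPPORTED_ARCHS = {"sm_37", "sm_50", "sm_60", "sm_61", "sm_70", "sm_75"}

def has_kernel_image(major: int, minor: int) -> bool:
    """True if this compute capability has a precompiled kernel image
    (ignores PTX forward-compatibility via compute_37)."""
    return f"sm_{major}{minor}" in SUPPORTED_ARCHS

print(has_kernel_image(3, 5))  # GT 710 -> False, hence the warning
print(has_kernel_image(7, 5))  # a Turing card -> True
```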
The following features are deprecated or dropped in the current release of the CUDA software. Deprecated features still work in the current release, but their documentation may have been removed, and they will become officially unsupported in a future release. We recommend that developers employ alternative solutions to these features in their software.
General CUDA
Support for Red Hat Enterprise Linux (RHEL) and CentOS 6.x is dropped.
Support for Kepler sm_30 and sm_32 architecture based products is dropped.
Support for the following compute capabilities are deprecated in the CUDA Toolkit:
That explains why my card is still supported by CUDA Toolkit 11: sm_35 is only deprecated, not dropped. But that official "CUDA Toolkit" does not help me with PyTorch. For that I need the conda binary package "cudatoolkit", which is a dependency of PyTorch.
At the moment, cudatoolkit 10.2 is installed. But that is too new for sm_35. With some luck, a slightly lower version of cudatoolkit will support my card.
The pytorch error message seems unambiguous: pytorch thinks you have a device with compute capability 3.5, but it requires a device with compute capability >= 3.7, so it cannot use this GPU.
If this is indeed a GPU with compute capability 3.5 (a third version of the GT 710?), you should be able to use CUDA 11 without any issues. “Deprecated” means things are fully functional but that NVIDIA will remove support “soon”, usually in the next CUDA version. So any feature deprecated in version N will likely disappear in version N+1.
Unless things changed, there should be a deviceQuery executable in the demo suite of your CUDA installation. On Windows it should be in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\vX.Y\extras\demo_suite, where X.Y specifies your CUDA version, e.g. 10.2. deviceQuery will tell you the compute architecture of your device: CUDA Capability Major/Minor version number.
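If you capture deviceQuery's output from a script, the relevant line can be pulled out with a regular expression. A small sketch (the helper name and sample string are illustrative, not part of any NVIDIA tooling):

```python
import re

def parse_capability(report: str):
    """Extract (major, minor) compute capability from a captured
    deviceQuery report, or None if the line is absent."""
    m = re.search(
        r"CUDA Capability Major/Minor version number:\s*(\d+)\.(\d+)", report)
    return (int(m.group(1)), int(m.group(2))) if m else None

sample = "  CUDA Capability Major/Minor version number:    3.5"
print(parse_capability(sample))  # -> (3, 5)
```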
This is all pretty straightforward except for the frustrating part that marketing folks at NVIDIA repeatedly assigned the same product name to parts using different chips (from different architectures, even). That’s just bad with a capital ‘B’.
I could not get PyTorch to run. I have tried a lot of previous installs; on a test script I always get errors like “RuntimeError: CUDA error: no kernel image is available for execution on the device”. The older versions gave other errors but were also not working.
How do I find the best PyTorch install command for compute capability 3.5?
I tried various combinations from https://pytorch.org/get-started/previous-versions/:
conda uninstall pytorch torchvision cudatoolkit=10.2 -c pytorch
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=9.2 -c pytorch # does not work
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.1 -c pytorch
conda uninstall pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.1 -c pytorch
Wheel
pip install torch==1.5.1+cu92 torchvision==0.6.1+cu92 -f https://download.pytorch.org/whl/torch_stable.html
pip uninstall torch
pip uninstall torch
conda install pytorch==1.0.0 torchvision==0.2.1 cuda80 -c pytorch # does not work
conda install pytorch==1.0.1 torchvision==0.2.2 cudatoolkit=10.0 -c pytorch
It is better to run the device query app than to rely on internet databases or Wikipedia. I don’t know anything about PyTorch (not an NVIDIA product, as best I know), so I cannot tell you which PyTorch versions work with compute capability 3.5. I would suggest asking on a forum or mailing list dedicated to PyTorch, or contacting whoever provides this software directly.
This GitHub thread seems to indicate that issues with compute capability 3.5 could be due to the way Magma is configured.
The device query app is part of the CUDA install. I have installed CUDA v11.0 (in v8.0 the demo suite is not yet available), and deviceQuery confirms CUDA Capability Major/Minor version number 3.5, thanks for the hint.
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\extras\demo_suite>deviceQuery.exe
deviceQuery.exe Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GT 710"
CUDA Driver Version / Runtime Version 11.0 / 11.0
CUDA Capability Major/Minor version number: 3.5
Total amount of global memory: 2048 MBytes (2147483648 bytes)
( 1) Multiprocessors, (192) CUDA Cores/MP: 192 CUDA Cores
GPU Max Clock rate: 954 MHz (0.95 GHz)
Memory Clock rate: 2505 Mhz
Memory Bus Width: 64-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: zu bytes
Total amount of shared memory per block: zu bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: zu bytes
Texture alignment: zu bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
CUDA Device Driver Mode (TCC or WDDM): WDDM (Windows Display Driver Model)
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.0, CUDA Runtime Version = 11.0, NumDevs = 1, Device0 = GeForce GT 710
Result = PASS
“RuntimeError: CUDA error: no kernel image is available for execution on the device”
This error refers to the CUDA toolkit that must be installed for ML tools to use CUDA capabilities.
You can follow the Software requirements section at this link: Install TensorFlow with pip.
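Before trying further install combinations, the documented torch.cuda calls can report what the current build actually sees. A sketch that degrades gracefully when PyTorch or a usable GPU is absent (the function name is illustrative):

```python
def cuda_status() -> str:
    """Report what the installed PyTorch build can see, if anything."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch is installed but sees no usable CUDA device"
    # Both calls are part of the documented torch.cuda API.
    major, minor = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    return f"{name}: compute capability {major}.{minor}"

print(cuda_status())
```

On the machine described above, a matching build would report compute capability 3.5 for the GT 710; a build without sm_35 kernel images typically fails only later, when a kernel is actually launched.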