What is the Compute Capability of a GeForce GT 710?

I cannot find the GeForce GT 710 in the “GeForce and TITAN Products” list at CUDA GPUs - Compute Capability | NVIDIA Developer, so I do not know its compute capability.

A similar question for an older card that was not listed is at What's the Compute Capability of GeForce GT 330. The answer there was probably to search the internet and find it in the CUDA C Programming Guide. In the new CUDA C++ Programming Guide of CUDA Toolkit v11.0.3, there is no such information.

According to the internet, there seem to have been multiple GPU models sold under that name: one had compute capability 2.x and the other had compute capability 3.0. Neither is supported by CUDA 11, which requires compute capability >= 3.5.

How can I find out the compute capability of my Asus GeForce GT 710, then? For now I can only hope that it is at least 3.0.

According to nvidia-smi, my driver supports CUDA version 11.0, and I was able to install CUDA Toolkit 11.0; see nvcc --version.

C:\Users\Admin>nvidia-smi
Sat Aug 15 13:17:47 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 451.82       Driver Version: 451.82       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 710     WDDM  | 00000000:01:00.0 N/A |                  N/A |
| N/A   58C    P0    N/A /  N/A |    380MiB /  2048MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

C:\Users\Admin>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:35_Pacific_Daylight_Time_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.relgpu_drvr445TC445_37.28845127_0

It does not work, though. Running a pytorch example outputs:

cuda:0
C:\Users\Admin\anaconda3\envs\ml\lib\site-packages\torch\cuda\__init__.py:125: UserWarning: 
GeForce GT 710 with CUDA capability sm_35 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the GeForce GT 710 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

Searching for “cudatoolkit supporting sm_35”, I found this at https://blog.exxactcorp.com/nvidia-cuda-11-now-available/:

2.5. Deprecated and Dropped Features [in CUDA 11]

The following features are deprecated or dropped in the current release of the CUDA software. Deprecated features still work in the current release, but their documentation may have been removed, and they will become officially unsupported in a future release. We recommend that developers employ alternative solutions to these features in their software.

General CUDA

  • Support for Red Hat Enterprise Linux (RHEL) and CentOS 6.x is dropped.
  • Support for Kepler sm_30 and sm_32 architecture based products is dropped.
  • Support for the following compute capabilities are deprecated in the CUDA Toolkit:

That explains why my card can still work with CUDA Toolkit 11: only sm_30 and sm_32 were dropped. But the official “CUDA Toolkit” does not help me with Pytorch. There I need the conda binary package “cudatoolkit”, which is a dependency of Pytorch.

At the moment, cudatoolkit 10.2 is installed, but that seems to be too recent for sm_35. With some luck, a slightly lower version of cudatoolkit will support my card.

The pytorch error message seems unambiguous: pytorch thinks you have a device with compute capability 3.5, but it requires a device with compute capability >= 3.7, so it cannot use this GPU.
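To make the comparison in that warning concrete, here is a minimal sketch of the general CUDA compatibility rule, not PyTorch's own check. It assumes the usual convention that an sm_XY binary runs on devices with the same major version and an equal or higher minor version, and that compute_XY PTX can be JIT-compiled for any device at or above X.Y. The function names are made up for illustration; the architecture list is the one from the warning.

```python
# Architecture list copied from the PyTorch warning quoted in this thread.
WARNING_ARCHS = "sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37"


def parse_arch(tag):
    """Split a tag such as 'sm_37' into ('sm', (3, 7)). Hypothetical helper."""
    kind, digits = tag.split("_")
    return kind, (int(digits[:-1]), int(digits[-1]))


def build_can_run_on(device_cap, arch_list):
    """Return True if any entry in arch_list can serve a device of capability device_cap."""
    for tag in arch_list.split():
        kind, cap = parse_arch(tag)
        if kind == "sm":
            # Binary code: same major version, device minor >= compiled minor.
            if device_cap[0] == cap[0] and device_cap[1] >= cap[1]:
                return True
        elif kind == "compute":
            # PTX: can be JIT-compiled for a device at or above that capability.
            if device_cap >= cap:
                return True
    return False


if __name__ == "__main__":
    for cap in [(3, 5), (3, 7), (6, 1)]:
        print(cap, build_can_run_on(cap, WARNING_ARCHS))
```

Under these assumptions a 3.5 device finds nothing usable in that list, while a 3.7 or 6.1 device does, which matches what the warning says.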

If this is indeed a GPU with compute capability 3.5 (a third version of the GT 710?), you should be able to use CUDA 11 without any issues. “Deprecated” means things are fully functional but that NVIDIA will remove support “soon”, usually in the next CUDA version. So any feature deprecated in version N will likely disappear in version N+1.

Unless things changed, there should be a deviceQuery executable in the demo suite of your CUDA installation. On Windows it should be in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\vX.Y\extras\demo_suite, where X.Y specifies your CUDA version, e.g. 10.2. deviceQuery will tell you the compute architecture of your device: CUDA Capability Major/Minor version number.
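If you want to pick that value out of the deviceQuery output from a script, something like the following sketch could work. It assumes the output contains a line labelled "CUDA Capability Major/Minor version number:"; the function name and the sample text are only illustrative.

```python
import re

# Matches the capability line that deviceQuery prints for each device.
CAPABILITY_RE = re.compile(
    r"CUDA Capability Major/Minor version number:\s*(\d+)\.(\d+)"
)


def parse_compute_capability(device_query_output):
    """Return (major, minor) for the first device listed, or None if absent."""
    match = CAPABILITY_RE.search(device_query_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample lines in the format deviceQuery is assumed to print.
SAMPLE = '''Device 0: "GeForce GT 710"
  CUDA Capability Major/Minor version number:    3.5
'''

if __name__ == "__main__":
    print(parse_compute_capability(SAMPLE))
```

To use it on a real machine, you could capture the executable's output with subprocess.run(..., capture_output=True, text=True) and pass its stdout to the function.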

This is all pretty straightforward except for the frustrating part that marketing folks at NVIDIA repeatedly assigned the same product name to parts using different chips (from different architectures even). That’s just bad with a capital ‘B’.

shows that this card has Cuda 3.5.

I could not get Pytorch to run. I have tried many earlier installs with a test script and always get errors like “RuntimeError: CUDA error: no kernel image is available for execution on the device”; the older ones gave other errors but were also not working.

How do I find the best Pytorch install command for compute capability 3.5?
I tried various combinations from https://pytorch.org/get-started/previous-versions/:
conda uninstall pytorch torchvision cudatoolkit=10.2 -c pytorch
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=9.2 -c pytorch # does not work
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.1 -c pytorch
conda uninstall pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.1 -c pytorch
Wheel
pip install torch==1.5.1+cu92 torchvision==0.6.1+cu92 -f https://download.pytorch.org/whl/torch_stable.html
pip uninstall torch
pip uninstall torch
conda install pytorch==1.0.0 torchvision==0.2.1 cuda80 -c pytorch # does not work
conda install pytorch==1.0.1 torchvision==0.2.2 cudatoolkit=10.0 -c pytorch
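
After each install attempt like the ones above, a short check along these lines could show whether that build can actually launch a kernel on the card. This is only a sketch: the function name is made up, and it assumes that an incompatible build surfaces as a RuntimeError on the first GPU operation, as the “no kernel image” error suggests.

```python
def cuda_smoke_test():
    """Try one tiny GPU computation and describe the outcome as a string."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        return "this PyTorch build does not see a usable CUDA device"
    cap = torch.cuda.get_device_capability(0)
    try:
        # Creating a tensor on the GPU and adding it forces a real kernel launch.
        x = torch.ones(4, device="cuda")
        total = float((x + x).sum())
    except RuntimeError as exc:
        return "capability {}: kernel launch failed: {}".format(cap, exc)
    return "capability {}: OK, sum = {}".format(cap, total)


if __name__ == "__main__":
    print(cuda_smoke_test())
```

Running it in each conda environment gives a quick yes or no per PyTorch/cudatoolkit combination without needing a full training script.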

TechPowerUp also has this:

It is better to run the device query app than to rely on internet databases or Wikipedia. I don’t know anything about pytorch (not an NVIDIA product, as best I know), so I cannot tell you which pytorch versions work with compute capability 3.5. I would suggest asking on a forum or mailing list dedicated to pytorch, or contacting whoever provides this software directly.

This GitHub thread seems to indicate that issues with compute capability 3.5 could be due to the way Magma is configured:

https://github.com/pytorch/pytorch/issues/32759

The device query app is part of the CUDA install. I have installed CUDA v11.0 (in v8.0 it is not yet available), and it confirms CUDA Capability Major/Minor version number 3.5. Thanks for the hint.

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\extras\demo_suite>deviceQuery.exe
deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GT 710"
  CUDA Driver Version / Runtime Version          11.0 / 11.0
  CUDA Capability Major/Minor version number:    3.5
  Total amount of global memory:                 2048 MBytes (2147483648 bytes)
  ( 1) Multiprocessors, (192) CUDA Cores/MP:     192 CUDA Cores
  GPU Max Clock rate:                            954 MHz (0.95 GHz)
  Memory Clock rate:                             2505 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 524288 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               zu bytes
  Total amount of shared memory per block:       zu bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          zu bytes
  Texture alignment:                             zu bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.0, CUDA Runtime Version = 11.0, NumDevs = 1, Device0 = GeForce GT 710
Result = PASS


“RuntimeError: CUDA error: no kernel image is available for execution on the device”, the older ones gave other errors but were also not working.
This error indicates that the CUDA toolkit needs to be installed for ML tools to use CUDA capabilities.
You can follow the Software Requirements section at this link: Install TensorFlow with pip