Global, shared memory, latency - GPU list

Gunaka · April 7, 2013, 1:04pm

Hi

I am trying to make a list of all Nvidia CUDA GPUs with almost all of their specifications.
On the official Nvidia site I found a list of graphics cards ([url]https://developer.nvidia.com/cuda-gpus[/url]), but the specification tables do not include global memory, shared memory, or latency.
Is there any other web site or document where I can find these specifications for almost every Nvidia graphics card?

I would appreciate any help

seibert · April 7, 2013, 8:49pm

For hardware specs, I like the Wikipedia page:

[url]http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units[/url]

It doesn’t give the compute capabilities of each card directly, although one can deduce it from the “Code Name” column with some outside information about the capabilities of each chip.

In terms of other specifications, this table in the CUDA C Programming Guide is also useful:

[url]http://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities[/url]

Other aspects of the architecture, like memory latencies, are generally not officially reported and have to be deduced with microbenchmarks. I know of no comprehensive resource for such information.

Gunaka · April 9, 2013, 7:58am

Thank you for your answer.

Gunaka · April 12, 2013, 11:44am

Is there any way to check or calculate the global memory and shared memory size of a GPU without querying it from code?

MingVonMongo · April 12, 2013, 12:17pm

Will you publish your list on the internet? I guess more people would like to know all the facts about all the cards…

Gunaka · April 13, 2013, 3:21pm

Yes. After I finish, I will post the Excel file here on the forum.
But right now I need some help with global memory. Is there a way to calculate it, or must I own the GPU and run deviceQuery?

example:
CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: “GeForce GT 640”
CUDA Driver Version / Runtime Version 5.0 / 5.0
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 4095 MBytes (4294246400 bytes)
( 2) Multiprocessors x (192) CUDA Cores/MP: 384 CUDA Cores
GPU Clock rate: 902 MHz (0.90 GHz)
Memory Clock rate: 667 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 262144 bytes
Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65536), 3D= (4096,4096,4096)
Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 5 / 0
Compute Mode:
< Exclusive Process (many threads in one process is able to use ::cudaSetDevice() with this device) >

tera · April 13, 2013, 3:30pm

Even deviceQuery doesn’t give global memory latency; you would need to write your own benchmark for that. For the published theoretical peak bandwidth values, see the Wikipedia entry seibert has pointed you to.
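
The theoretical peak bandwidth can also be estimated from the two memory lines that deviceQuery does print. A minimal sketch, assuming the memory transfers data twice per clock (double data rate), which is the usual convention for this calculation; the function name and the `transfers_per_clock` parameter are made up for illustration:

```python
def peak_bandwidth_gb_s(memory_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Estimate theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 is an assumption (double data rate memory).
    """
    bytes_per_transfer = bus_width_bits / 8
    return memory_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


# Values taken from the deviceQuery output posted above:
# Memory Clock rate: 667 Mhz, Memory Bus Width: 128-bit
print(f"{peak_bandwidth_gb_s(667, 128):.1f} GB/s")  # prints "21.3 GB/s"
```

This is only an upper bound; the bandwidth a kernel actually achieves has to be measured.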

vvolkov · April 13, 2013, 5:47pm

You don’t need to have a device to know how much global memory it has. Just check the specs. Size of the memory is one of the key selling points, e.g. when you see EVGA GeForce GTX 680 2048MB GDDR5 this means you have 2GB of global memory.

Shared memory sizes are listed in Table 10, “Technical Specifications per Compute Capability”, of the CUDA C Programming Guide (CUDA Toolkit Documentation).

Which latency are you interested in? (And what would you do with it if you knew it?)

Gunaka · April 14, 2013, 2:47pm

Are you sure?
Look at the deviceQuery output for the “GeForce GT 640” that I posted and compare it with the official Nvidia specifications for that card.

4095 MBytes is not equal to 2048 MB.

tera · April 14, 2013, 3:41pm

There is a reason why the specs say “Standard Memory Config”: vendors are free to ship cards with memory configurations different from the one Nvidia proposed. These cards are often called “special edition” by the vendor and usually have more than the standard amount of memory, to stand out in a very uniform market.

vvolkov · April 14, 2013, 6:01pm

That means you have a GeForce GT 640 4GB, not a GeForce GT 640 2GB. As tera pointed out, different configurations are possible. Either way, you don’t need to have the device to know the number: you can find it elsewhere, such as on the packaging.
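
As a quick sanity check, the byte count from the deviceQuery output posted above can be compared with a nominal 4 GB. This is just a sketch of the arithmetic; the variable names are made up, and it says nothing about why the reported figure falls slightly short:

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

reported_bytes = 4294246400  # "Total amount of global memory" from deviceQuery

# deviceQuery rounds down to whole MBytes (1 MByte = 1024 * 1024 bytes here).
reported_mib = reported_bytes // MIB

# Difference between a nominal 4 GiB and what the device reports.
shortfall_kib = (4 * GIB - reported_bytes) // 1024

print(reported_mib)   # prints 4095
print(shortfall_kib)  # prints 704
```

So the card reports a little less than 4 GiB (704 KiB short), which is much closer to a 4 GB configuration than to the 2048 MB one.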

Mati86 · June 23, 2013, 10:28am

In the deviceQuery output for the GeForce GT 640:

Maximum sizes of each dimension of a block: 1024 x 1024 x 64

Does this mean that I can use a block of size 1024 x 1024 x 64?

Can anybody answer this?

pasoleatis · June 23, 2013, 12:29pm

No. It only means that if you have dim3 threadsperblock, the first two components cannot be bigger than 1024 and the last component cannot be bigger than 64, but you must still have threadsperblock.x * threadsperblock.y * threadsperblock.z <= 1024.
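
The rule can be written down as a small check. This is a sketch only, with the limits copied from the deviceQuery output posted earlier in the thread (other devices may report different values); `block_is_valid` is a made-up helper, not a CUDA API:

```python
# Limits from the GeForce GT 640 deviceQuery output above.
MAX_BLOCK_DIM = (1024, 1024, 64)   # per-dimension limits (x, y, z)
MAX_THREADS_PER_BLOCK = 1024       # limit on the product x * y * z


def block_is_valid(x, y, z):
    """Return True if a block of x * y * z threads respects both limits."""
    dims = (x, y, z)
    within_each_dim = all(1 <= d <= limit for d, limit in zip(dims, MAX_BLOCK_DIM))
    return within_each_dim and x * y * z <= MAX_THREADS_PER_BLOCK


print(block_is_valid(32, 32, 1))       # True:  32 * 32 = 1024 threads
print(block_is_valid(16, 16, 4))       # True:  16 * 16 * 4 = 1024 threads
print(block_is_valid(1024, 1024, 64))  # False: far more than 1024 threads
print(block_is_valid(1, 1, 128))       # False: z exceeds 64
```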

Mati86 · June 25, 2013, 3:03pm

Thanks for the help!

Mati86 · June 25, 2013, 5:26pm

In the deviceQuery output for the GeForce GT 640:

Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535

Does this mean I can launch a grid of size 2147483647 x 65535 x 65535?

Can anyone answer?
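
As far as I can tell from the compute capability table in the CUDA C Programming Guide, the grid limits are per dimension only, with no separate cap on the product like the one for threads per block. Under that assumption, a rough sketch of the check, plus the usual way to work out how many blocks a problem needs (both helper names are made up):

```python
# Per-dimension grid limits from the GeForce GT 640 deviceQuery output above.
MAX_GRID_DIM = (2147483647, 65535, 65535)


def grid_is_valid(x, y, z):
    """Return True if every grid dimension is within its own limit.

    Assumes there is no additional limit on the product x * y * z.
    """
    return all(1 <= d <= limit for d, limit in zip((x, y, z), MAX_GRID_DIM))


def blocks_needed(n_elements, threads_per_block):
    """Number of blocks for one thread per element (ceiling division)."""
    return (n_elements + threads_per_block - 1) // threads_per_block


print(grid_is_valid(2147483647, 65535, 65535))  # True under the assumption above
print(grid_is_valid(1, 65536, 1))               # False: y exceeds 65535
print(blocks_needed(1_000_000, 256))            # 3907
```

Passing the check only means the launch configuration is within the reported limits; a grid that large would take an impractically long time to run.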
