device organization

martman · April 5, 2008, 11:47pm

Hello.

I’m not sure I fully understand the organization of a CUDA device yet, and I’m trying to make sense of the output of deviceQuery.

Does this mean I can have 65535 blocks total on one big grid with 512 threads running in each block? Does this also mean that at any point I actually have 32 threads running at once?

Thanks

There is 1 device supporting CUDA

Device 0: "GeForce 8800 GTX"

Major revision number: 1

Minor revision number: 0

Total amount of global memory: 804978688 bytes

Total amount of constant memory: 65536 bytes

Total amount of shared memory per block: 16384 bytes

Total number of registers available per block: 8192

Warp size: 32

Maximum number of threads per block: 512

Maximum sizes of each dimension of a block: 512 x 512 x 64

Maximum sizes of each dimension of a grid: 65535 x 65535 x 1

Maximum memory pitch: 262144 bytes

Texture alignment: 256 bytes

Clock rate: 1350000 kilohertz

Test PASSED

Press ENTER to exit…

MisterAnderson42 · April 6, 2008, 12:32am

Well, you can run up to 65535 * 65535 blocks in a single call, which is a lot. And 32 is just the warp size, not the number of threads running at once. G80 has 16 multiprocessors, each of which can keep 24 warps running in an interleaved fashion => the device can be running up to 16 * 24 * 32 = 12288 threads at once.
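To make the arithmetic concrete, here is a minimal sketch in plain Python (host-side arithmetic only, not CUDA code) that redoes the numbers from the deviceQuery output and the reply. The function names are made up for illustration. The cap of 8 resident blocks per multiprocessor is a G80-era figure that is not stated in the thread, so treat it as an assumption. The sketch also ignores register and shared memory usage, which can lower the real number further.

```python
# Figures from the deviceQuery output in the question.
WARP_SIZE = 32
MAX_THREADS_PER_BLOCK = 512
MAX_GRID_X = 65535
MAX_GRID_Y = 65535

# Figures from the reply (G80 / GeForce 8800 GTX).
MULTIPROCESSORS = 16
WARPS_PER_MULTIPROCESSOR = 24

# Assumption, not stated in the thread: G80-class parts keep at most
# 8 blocks resident on one multiprocessor.
MAX_BLOCKS_PER_MULTIPROCESSOR = 8


def max_blocks_per_launch():
    """Largest grid a single kernel launch can describe (x * y blocks)."""
    return MAX_GRID_X * MAX_GRID_Y


def max_resident_threads():
    """Upper bound on threads the device interleaves at once."""
    return MULTIPROCESSORS * WARPS_PER_MULTIPROCESSOR * WARP_SIZE


def resident_threads_for_block_size(threads_per_block):
    """Estimate resident threads for a given block size.

    Only whole blocks fit on a multiprocessor, and each block occupies
    a whole number of warps. Register and shared memory limits are
    ignored here.
    """
    if not 1 <= threads_per_block <= MAX_THREADS_PER_BLOCK:
        raise ValueError("block size must be between 1 and 512 threads")
    # Round the block up to whole warps.
    warps_per_block = (threads_per_block + WARP_SIZE - 1) // WARP_SIZE
    blocks_per_mp = min(WARPS_PER_MULTIPROCESSOR // warps_per_block,
                        MAX_BLOCKS_PER_MULTIPROCESSOR)
    return blocks_per_mp * threads_per_block * MULTIPROCESSORS


if __name__ == "__main__":
    print(max_blocks_per_launch())               # 4294836225
    print(max_resident_threads())                # 12288
    print(resident_threads_for_block_size(512))  # 8192
    print(resident_threads_for_block_size(256))  # 12288
```

Under these assumptions, a 512-thread block leaves room for only one block per multiprocessor (16 of the 24 warp slots), so 256-thread blocks would keep more threads in flight. The remaining blocks of a large grid are not lost; they are scheduled as resident blocks finish.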