I’m a total newbie here so this should be easy. Here’s part of my deviceQuery:

Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1

My questions are:

What is a Warp?

For the dimensions of the grid, is this the number of threads or number of blocks?

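The execution model is a hierarchy: a grid is made up of blocks, and each block is made up of threads. Within a block, the hardware groups the threads into warps, and the threads of a warp are scheduled and executed together.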

so:

Dimensions of grid = number of blocks (in each dimension)

Dimensions of block = number of threads (in each dimension)

32 threads = 1 warp

kernel<<<100,320>>>(…); will thus schedule 1 grid of 100 blocks, each having 320 threads (10 warps).
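The arithmetic behind that launch can be sketched in plain host-side C++, so it compiles without a GPU. The helper names `warpsPerBlock` and `totalThreads` are made up for illustration, and the warp size of 32 is simply the value from the deviceQuery output above. A block whose thread count is not a multiple of 32 still occupies a whole number of warps, so the division rounds up.

```cpp
#include <cassert>

// Warp size as reported by deviceQuery in the question.
// Assumption: other devices should be queried rather than hard-coded.
constexpr int kWarpSize = 32;

// Number of warps needed to hold one block's threads (hypothetical helper).
// Rounds up, because a partially filled warp still takes up a full warp.
constexpr int warpsPerBlock(int threadsPerBlock) {
    return (threadsPerBlock + kWarpSize - 1) / kWarpSize;
}

// Total number of threads in a 1D launch of `blocks` blocks (hypothetical helper).
constexpr long long totalThreads(int blocks, int threadsPerBlock) {
    return static_cast<long long>(blocks) * threadsPerBlock;
}

// Compile-time checks for the <<<100,320>>> launch discussed above.
static_assert(warpsPerBlock(320) == 10, "320 threads should fill exactly 10 warps");
static_assert(totalThreads(100, 320) == 32000, "100 blocks of 320 threads each");
```

So a block of 320 threads fills exactly 10 warps, while a block of, say, 33 threads would need 2 warps with the second one almost empty. That is one reason block sizes are usually chosen as multiples of 32.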

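Note that each of the X, Y, Z values in ‘X x Y x Z’ is a per-dimension maximum, not a combination you can use all at once. The overall configuration is subject to further constraints that depend on your GPU’s architecture number (1.0-1.3), also known as its “Compute Capability”. For example, the total number of threads in a block (X * Y * Z) cannot exceed the “Maximum number of threads per block” value, which is 512 in your output. From memory, two particularly relevant sections of the Programming Guide are 5.2 and Appendix A.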
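As a rough sketch of how those limits combine, here is a host-side C++ check of a launch configuration against the numbers in the deviceQuery output above. The names `Extent3`, `DeviceLimits`, `kDeviceQueryLimits`, and `validLaunch` are hypothetical, and the limits are hard-coded from the question. In a real program you would read them from `cudaGetDeviceProperties` (the `maxThreadsPerBlock`, `maxThreadsDim`, and `maxGridSize` fields) rather than typing them in.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3, so this compiles without the CUDA toolkit.
struct Extent3 {
    unsigned x, y, z;
};

// Limits of one device (hypothetical struct).
struct DeviceLimits {
    unsigned maxThreadsPerBlock;
    Extent3 maxBlockDim;
    Extent3 maxGridDim;
};

// Values copied from the deviceQuery output in the question.
const DeviceLimits kDeviceQueryLimits = {512, {512, 512, 64}, {65535, 65535, 1}};

// True if every dimension is at least 1 and no larger than its maximum.
inline bool fitsWithin(const Extent3& d, const Extent3& max) {
    return d.x >= 1 && d.y >= 1 && d.z >= 1 &&
           d.x <= max.x && d.y <= max.y && d.z <= max.z;
}

// True if the grid and block sizes respect the per-dimension limits
// and the block's total thread count respects maxThreadsPerBlock.
inline bool validLaunch(const Extent3& grid, const Extent3& block,
                        const DeviceLimits& limits) {
    if (!fitsWithin(grid, limits.maxGridDim)) return false;
    if (!fitsWithin(block, limits.maxBlockDim)) return false;
    const unsigned long long threads =
        static_cast<unsigned long long>(block.x) * block.y * block.z;
    return threads <= limits.maxThreadsPerBlock;
}
```

With these limits, a 100-block grid of 320-thread blocks passes. A 32 x 32 x 1 block fails even though each dimension is under its own maximum, because 32 * 32 = 1024 threads is more than the 512 allowed per block.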

Again, this is all explained in the programming guide: