I’m a total newbie here so this should be easy. Here’s part of my deviceQuery:
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
My questions are:
What is a Warp?
For the dimensions of the grid, is this the number of threads or number of blocks?
The execution model is a hierarchy: a grid is made up of blocks, each block is made up of threads, and the hardware executes the threads of a block in groups of 32 called warps.
so:
Dimension of Grid = number of blocks
Dimension of Block = number of threads
32 threads = 1 warp
kernel<<<100,320>>>(…) will thus launch 1 grid of 100 blocks, each having 320 threads (10 warps).
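To make the arithmetic concrete, here is a rough host-side sketch in plain C++ (it should also compile as the host part of a .cu file). The names kWarpSize, warpsPerBlock and totalThreads are made up for illustration and are not part of the CUDA API; the warp size of 32 is simply taken from the deviceQuery output above.

```cpp
#include <cassert>

// Warp size as reported by deviceQuery (assumed fixed at 32 here).
const int kWarpSize = 32;

// Hypothetical helper: warps needed to hold one block's threads.
// Rounds up, because a partly filled warp still occupies a whole warp.
int warpsPerBlock(int threadsPerBlock) {
    assert(threadsPerBlock > 0);
    return (threadsPerBlock + kWarpSize - 1) / kWarpSize;
}

// Hypothetical helper: total threads in a launch of `blocks` blocks
// with `threadsPerBlock` threads each.
long long totalThreads(int blocks, int threadsPerBlock) {
    assert(blocks > 0 && threadsPerBlock > 0);
    return static_cast<long long>(blocks) * threadsPerBlock;
}
```

For the <<<100,320>>> launch, warpsPerBlock(320) gives 10 and totalThreads(100, 320) gives 32000. A block size that is not a multiple of 32 still rounds up, so 33 threads would take 2 warps.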
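Note that each of the X, Y, Z values in ‘X x Y x Z’ is a maximum for that dimension only. The overall configuration is subject to further constraints: for example, a block's X, Y and Z sizes multiplied together cannot exceed the maximum number of threads per block (512 in your output), so 512 x 512 x 64 is not a valid block. Other limits depend on your particular GPU's architecture number (1.0-1.3), a.k.a. its “Compute Capability”. From memory, two particularly relevant sections of the Programming Guide are 5.2 and Appendix A.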
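If it helps, the limits from the deviceQuery output can be turned into a small check. This is only a sketch in plain host-side C++: DeviceLimits, Dim3, queriedLimits and launchFits are hypothetical names for illustration (Dim3 merely stands in for CUDA's dim3), and it covers only the limits quoted in this thread, not every constraint the Programming Guide lists.

```cpp
#include <cassert>

// Hypothetical container for the limits printed by deviceQuery.
struct DeviceLimits {
    int maxThreadsPerBlock;  // 512
    int maxBlockDim[3];      // 512 x 512 x 64
    int maxGridDim[3];       // 65535 x 65535 x 1
};

// Hypothetical stand-in for CUDA's dim3.
struct Dim3 { int x, y, z; };

// The values from the deviceQuery output quoted in the question.
DeviceLimits queriedLimits() {
    return DeviceLimits{512, {512, 512, 64}, {65535, 65535, 1}};
}

// True if the grid and block sizes respect the limits above.
bool launchFits(const DeviceLimits& lim, Dim3 grid, Dim3 block) {
    assert(lim.maxThreadsPerBlock > 0);
    const int g[3] = {grid.x, grid.y, grid.z};
    const int b[3] = {block.x, block.y, block.z};
    for (int i = 0; i < 3; ++i) {
        if (g[i] < 1 || g[i] > lim.maxGridDim[i]) return false;
        if (b[i] < 1 || b[i] > lim.maxBlockDim[i]) return false;
    }
    // Per-dimension maxima are not enough: the product must also fit.
    const long long threads =
        static_cast<long long>(block.x) * block.y * block.z;
    return threads <= lim.maxThreadsPerBlock;
}
```

With these limits, a grid of {100,1,1} and a block of {320,1,1} passes, and so does a block of {16,16,2} (exactly 512 threads). A block of {512,512,64} fails even though each dimension is within its own maximum, and a grid of {1,1,2} fails because this device allows only 1 in the grid's Z dimension.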
Again, this is all explained in the programming guide: