Dimensionality of grid of thread blocks questions

Skybuck · February 10, 2015, 12:28am

I have some questions about kernels written for compute 1.x. The compute capability specification mentions only 2 dimensions allowed for “grid of thread blocks”.

The rest of the documentation mentions: “Y or Z dimensions limitations”.

So my question is the following:

Is it possible to launch any combination of X, Y and Z dimensions on compute capability 1.x devices?

So for example:

X=5,Y=6,Z=1 (basically X,Y)
X=1,Y=8,Z=9 (basically Y,Z)
X=10,Y=1,Z=10 (basically X,Z)

The assumption is that any unused dimension must still be at least 1.
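To make the three example shapes concrete, here is a minimal host-side sketch in plain C++ that checks a grid against the compute 1.x limits as I understand them. The limits are an assumption to verify against the compute capability table for your toolkit version: grids are at most 2D (so Z must be 1), and X and Y are each at most 65535. `GridDim` and `gridFitsCc1x` are hypothetical names, not part of any CUDA API.

```cpp
#include <cstdint>

// Hypothetical helper type holding the grid extents passed to a launch.
struct GridDim {
    std::uint32_t x;
    std::uint32_t y;
    std::uint32_t z;
};

// Assumed compute capability 1.x limit for the X and Y grid extents.
constexpr std::uint32_t kAssumedMaxGridXY = 65535;

// Returns true if the grid fits the assumed 1.x limits:
// every extent >= 1, no third grid dimension, X and Y within range.
bool gridFitsCc1x(const GridDim& g) {
    if (g.x == 0 || g.y == 0 || g.z == 0) return false;  // extents must be at least 1
    if (g.z != 1) return false;                           // assumed: grids are at most 2D
    return g.x <= kAssumedMaxGridXY && g.y <= kAssumedMaxGridXY;
}
```

Under these assumptions only X=5,Y=6,Z=1 passes. The Y,Z and X,Z shapes are rejected because Z is greater than 1, which would mean the 2D limit is not a free choice of any two axes.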

The way I interpret the documentation, compute capability 1.x kernels/devices can only be launched with 2D “grid of thread blocks” launch parameters. Is this true, or am I misinterpreting the documentation?

I interpret the “Y or Z” text as a kind of “switch”: the kernel writer needs to choose which dimensions he/she wants to use. Are only 2 of the 3 available?

What about the actual kernel code? Can it only use 2 dimensions? Would using all 3 dimensions cause a compile failure?
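If a kernel logically needs three grid dimensions but the device only accepts two, one possible workaround is to fold Z into Y at launch time and recover both inside the kernel. The sketch below shows only the index arithmetic in plain C++, assuming a 65535 limit on the folded Y extent. `foldGrid` and `unfoldBlockIndex` are hypothetical helpers, and the `bx`/`by` parameters stand in for what a kernel would read from `blockIdx.x` and `blockIdx.y`.

```cpp
#include <cstdint>

struct Grid2 { std::uint32_t x; std::uint32_t y; };                    // hypothetical 2D launch extents
struct Index3 { std::uint32_t x; std::uint32_t y; std::uint32_t z; };  // logical 3D block index

// Assumed compute capability 1.x limit for each grid extent.
constexpr std::uint64_t kAssumedMaxExtent = 65535;

// Host side: fold a logical (nx, ny, nz) grid into 2D extents (nx, ny * nz).
// Returns false if any extent is zero or the folded grid exceeds the assumed limit.
bool foldGrid(std::uint32_t nx, std::uint32_t ny, std::uint32_t nz, Grid2& out) {
    if (nx == 0 || ny == 0 || nz == 0) return false;
    const std::uint64_t foldedY = static_cast<std::uint64_t>(ny) * nz;  // 64-bit to avoid overflow
    if (nx > kAssumedMaxExtent || foldedY > kAssumedMaxExtent) return false;
    out.x = nx;
    out.y = static_cast<std::uint32_t>(foldedY);
    return true;
}

// Kernel side (same arithmetic): recover the logical (x, y, z) block index
// from the 2D block index, given the logical Y extent ny.
Index3 unfoldBlockIndex(std::uint32_t bx, std::uint32_t by, std::uint32_t ny) {
    return Index3{bx, by % ny, by / ny};
}
```

With this layout a logical 10 x 8 x 9 grid would launch as 10 x 72, and 2D block (3, 42) maps back to logical block (3, 2, 5).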

My questions are mostly related to the CUDA driver API and how to launch compute 1.x kernels successfully (and, later on, how to perhaps write those older kernels some day).

Skybuck · February 10, 2015, 12:30am

This is also related to how to compute optimal launch parameters for compute 1.x kernels. I am trying to figure out whether some dimensions must be skipped.
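As a rough illustration of computing launch parameters under a 2D-only grid, the sketch below picks X and Y extents that cover a given number of threads, with Z left implicitly at 1. It assumes a 65535 limit per grid extent and makes no claim about what is optimal for occupancy. `chooseGrid2D` and `LaunchGrid` are hypothetical names.

```cpp
#include <cstdint>

struct LaunchGrid { std::uint32_t x; std::uint32_t y; };  // hypothetical; Z is implicitly 1

// Assumed compute capability 1.x limit for each of the X and Y grid extents.
constexpr std::uint64_t kAssumedMaxGridExtent = 65535;

// Integer division rounded up (b must be non-zero).
std::uint64_t ceilDiv(std::uint64_t a, std::uint64_t b) {
    return (a + b - 1) / b;
}

// Chooses a 2D grid with x * y >= number of blocks needed.
// Returns false if the inputs are zero or the work does not fit in a 2D grid.
bool chooseGrid2D(std::uint64_t totalThreads, std::uint32_t threadsPerBlock, LaunchGrid& out) {
    if (totalThreads == 0 || threadsPerBlock == 0) return false;
    const std::uint64_t blocks = ceilDiv(totalThreads, threadsPerBlock);
    const std::uint64_t gy = ceilDiv(blocks, kAssumedMaxGridExtent);  // rows needed
    if (gy > kAssumedMaxGridExtent) return false;                     // too large even for 2D
    const std::uint64_t gx = ceilDiv(blocks, gy);                     // blocks per row
    out.x = static_cast<std::uint32_t>(gx);
    out.y = static_cast<std::uint32_t>(gy);
    return true;
}
```

Because `x * y` can exceed the number of blocks actually needed, a kernel launched this way would have to compute its linear index and return early when that index is past the end of the data.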