I have some questions about kernels written for compute capability 1.x. The compute capability specification mentions that only 2 dimensions are allowed for the “grid of thread blocks”.
The rest of the documentation mentions: “Y or Z dimensions limitations”.
So my question is the following:
Is it possible to launch a kernel with any combination of the X, Y and Z dimensions on compute capability 1.x devices?
So for example:
X=5,Y=6,Z=1 (basically X,Y)
X=1,Y=8,Z=9 (basically Y,Z)
X=10,Y=1,Z=10 (basically X,Z)
My assumption is that the remaining dimension must still be at least 1.
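To make the three examples concrete, here is a minimal sketch in plain C (no CUDA calls, so it says nothing about what the hardware or driver actually accepts). The `GridShape` type, the `EXAMPLES` array and the `used_dims` helper are hypothetical names introduced only for illustration; the helper merely counts how many dimensions have an extent greater than 1.

```c
/* Hypothetical type: one candidate launch shape (extent per dimension). */
typedef struct {
    unsigned x;
    unsigned y;
    unsigned z;
} GridShape;

/* The three example shapes from the question. */
static const GridShape EXAMPLES[3] = {
    {5, 6, 1},   /* basically X,Y */
    {1, 8, 9},   /* basically Y,Z */
    {10, 1, 10}  /* basically X,Z */
};

/* Count the dimensions that are really in use (extent greater than 1).
 * This is only bookkeeping for the question; it does not validate the
 * shape against any device limit. */
static int used_dims(GridShape g) {
    return (g.x > 1) + (g.y > 1) + (g.z > 1);
}
```

With these definitions, `used_dims` returns 2 for each of the three shapes, which is why I wonder whether all of them are treated alike or whether only one particular pair of dimensions is accepted.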
The way I interpret the documentation is that compute capability 1.x kernels/devices can only be launched with 2D “grid of threads” launch parameters. Is this true, or am I misinterpreting the documentation?
I interpret the “Y or Z” text as a kind of “switch”: the kernel writer needs to choose which dimensions he/she wants to use. Is it the case that not all 3 are available, only 2?
What about the actual kernel code? Can it only use 2 dimensions? Would using all 3 dimensions cause a compilation failure?
My questions are mostly related to the CUDA driver API and how to launch compute capability 1.x kernels successfully (and, later on, perhaps how to write those older kernels some day).