Grid and Thread size problem...

I ran a simple test:

__global__ void test(float* diff)
{
    diff[0] = 3.33f;
}


dim3 grid(4,4,D);
dim3 threads(16,16,1);

It works with D=1 but not with D>1. My real program is more complicated than this; this is just a reduced test to debug it.
Thanks for your help.

A grid of blocks can only be 2D (as written in the manual).
A block can be 3D.
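
If a third grid dimension is still needed while the grid is limited to 2D, one possible workaround is to fold the z index into grid.y and recover it inside the kernel. This is only a minimal sketch under that assumption; GRID_X, GRID_Y, logicalY, logicalZ, testFolded and launchFolded are hypothetical names, not part of the original program.

```cuda
#include <cuda_runtime.h>

// Hypothetical logical grid: GRID_X x GRID_Y x depth blocks.
#define GRID_X 4u
#define GRID_Y 4u

// Split a folded y block index back into its logical (y, z) parts.
__host__ __device__ inline unsigned logicalY(unsigned foldedY) { return foldedY % GRID_Y; }
__host__ __device__ inline unsigned logicalZ(unsigned foldedY) { return foldedY / GRID_Y; }

// Each block writes its logical z index into one slot of out.
// out must hold GRID_X * GRID_Y * depth floats.
__global__ void testFolded(float* out)
{
    unsigned by = logicalY(blockIdx.y);
    unsigned bz = logicalZ(blockIdx.y);
    if (threadIdx.x == 0 && threadIdx.y == 0) {
        unsigned blockId = (bz * GRID_Y + by) * GRID_X + blockIdx.x;
        out[blockId] = (float)bz;
    }
}

// Launch with a 2D grid that stands in for a 4 x 4 x depth grid.
cudaError_t launchFolded(unsigned depth, float* d_out)
{
    dim3 grid(GRID_X, GRID_Y * depth, 1);  // z folded into y, so grid.z stays 1
    dim3 threads(16, 16, 1);               // 256 threads per block
    testFolded<<<grid, threads>>>(d_out);
    return cudaGetLastError();
}
```

Calling launchFolded(3, d_out) would then run 4 x 12 blocks that behave like a 4 x 4 x 3 grid.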

Hmm… OK, I see! But in my test, when I try

dim3 grid(4,4,1);
dim3 threads(16,16,D);

with D>1, it doesn’t work either… Strange, no?

Problem fixed!

With blocks of size 16x16x1 (256 threads) and 16x16x2 (512 threads) it works.
But with 16x16x3 (768 threads) it doesn’t work, because that is more than the 512 threads per block allowed!
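
For anyone hitting the same thing, here is a rough sketch of how the failure could be caught instead of passing silently. It assumes device 0 and reads the limit from the device properties rather than hard-coding 512, since the limit depends on the GPU. The helpers threadsPerBlock and launchWithDepth are names made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void test(float* diff)
{
    diff[0] = 3.33f;
}

// Total number of threads in one block: x * y * z.
inline unsigned threadsPerBlock(dim3 b) { return b.x * b.y * b.z; }

// Launch test() with a 16x16xdepth block and return the CUDA status.
// d_diff must point to at least one float in device memory.
cudaError_t launchWithDepth(unsigned depth, float* d_diff)
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);  // assumes device 0
    if (err != cudaSuccess) return err;

    dim3 grid(4, 4, 1);
    dim3 threads(16, 16, depth);

    // Refuse the launch up front if the block is too large for this device.
    unsigned n = threadsPerBlock(threads);
    if (n > (unsigned)prop.maxThreadsPerBlock) {
        printf("%u threads per block exceeds the device limit of %d\n",
               n, prop.maxThreadsPerBlock);
        return cudaErrorInvalidConfiguration;
    }

    test<<<grid, threads>>>(d_diff);
    err = cudaGetLastError();        // reports an invalid launch configuration
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();  // reports errors raised while the kernel runs
}
```

Printing cudaGetErrorString() on the returned status after each launch would have pointed straight at the block size.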
Thanks for your help.