Wrong gridDim ... causes: invalid configuration argument

Hello,

I wrote a really simple kernel, and launching it fails with “invalid configuration argument”.

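#include <cstdio>

__global__ void SimpleKernel(int iLevel) {
	int id = threadIdx.x;  // thread index within the block (unused in this minimal example)
}

// Host code
int main() {
	int iLevel = 0;

	// Launch with a 4x4x4 grid of 16-thread blocks
	SimpleKernel<<<dim3(4,4,4), dim3(16,1,1)>>>(iLevel);

	// Prints "invalid configuration argument"
	printf("%s\n", cudaGetErrorString(cudaGetLastError()));
	return 0;
}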

Why does my code cause an error? Using the grid “dim3(4,4,1)” instead of “dim3(4,4,4)” works fine.

However, the CUDA manual says there’s just a block-limit of 64 in z-direction - not of 1.

Can anybody help me out?

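Grids are restricted to 2D in CUDA at the moment. You can have 3D block indexing, but only 2D grid indexing, so the z-component of the grid dimensions must be 1. The limit of 64 in the z-direction applies to the thread block, not the grid. It is explained in Appendix A of the Programming Guide.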
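If you want to check the limits on your own device rather than look them up, something like the following should work. It is only a sketch: `gridFitsDevice` and `printLaunchLimits` are names made up for this example, while `cudaGetDeviceProperties` and the `maxThreadsDim` / `maxGridSize` fields come from the CUDA runtime API. It assumes device 0 is the one you launch on.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if every grid dimension is within the
// limits the device reports in cudaDeviceProp::maxGridSize.
bool gridFitsDevice(const cudaDeviceProp &prop, dim3 grid) {
    return grid.x >= 1 && grid.x <= (unsigned int)prop.maxGridSize[0] &&
           grid.y >= 1 && grid.y <= (unsigned int)prop.maxGridSize[1] &&
           grid.z >= 1 && grid.z <= (unsigned int)prop.maxGridSize[2];
}

// Hypothetical helper: prints the block and grid limits of device 0
// and reports whether the 4x4x4 grid from the question would fit.
cudaError_t printLaunchLimits() {
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return err;
    }
    printf("max block dims: %d x %d x %d\n",
           prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
    printf("max grid dims:  %d x %d x %d\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);

    dim3 grid(4, 4, 4);
    printf("dim3(4,4,4) grid %s on this device\n",
           gridFitsDevice(prop, grid) ? "fits" : "does not fit");
    return cudaSuccess;
}
```

Calling `printLaunchLimits()` from `main` on a device that only supports 2D grids should report a maximum grid z-dimension of 1, which is why the launch is rejected.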

Thank you very much. I must have skipped that part.

So I guess I’ll “emulate” a 3D grid with a 2D grid.
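For anyone finding this later, here is a rough sketch of one way to do that emulation: the z-slices of the logical grid are stacked along the y-dimension of the real 2D grid, and the kernel recovers the three logical block coordinates with a division and a modulo. The names `BlockIdx3`, `decodeBlockIdx`, `foldGrid`, `SimpleKernel3D` and `launchEmulated3D` are invented for this example and are not part of CUDA. It assumes that the logical y times z extent stays within the device's limit for the grid's y-dimension. On later hardware (compute capability 2.0 and up, with CUDA 4.0 or newer) 3D grids are supported directly, so this is only needed on older devices.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Logical 3D block index recovered from a 2D grid (illustrative type).
struct BlockIdx3 {
    unsigned int x, y, z;
};

// The z-slices are stacked along the physical y-dimension:
// physical y = z * logicalY + y, so y and z come back via % and /.
__host__ __device__ inline BlockIdx3 decodeBlockIdx(unsigned int bx,
                                                    unsigned int by,
                                                    unsigned int logicalY) {
    BlockIdx3 b;
    b.x = bx;
    b.y = by % logicalY;
    b.z = by / logicalY;
    return b;
}

// Host side: fold a logical 3D grid into the 2D grid that is actually launched.
inline dim3 foldGrid(dim3 logical) {
    return dim3(logical.x, logical.y * logical.z, 1);
}

__global__ void SimpleKernel3D(int iLevel, unsigned int logicalY) {
    // b.x, b.y and b.z play the role of a 3D blockIdx.
    BlockIdx3 b = decodeBlockIdx(blockIdx.x, blockIdx.y, logicalY);
    int id = threadIdx.x;
    // Real work using b, id and iLevel would go here.
}

// Launches the logical 4x4x4 grid from the question as a 4x16 grid.
cudaError_t launchEmulated3D(int iLevel) {
    dim3 logical(4, 4, 4);
    SimpleKernel3D<<<foldGrid(logical), dim3(16, 1, 1)>>>(iLevel, logical.y);
    return cudaGetLastError();
}
```

Calling `launchEmulated3D(0)` from `main` in place of the original launch starts the same 64 blocks, only arranged as a 4x16 grid. Stacking along y rather than x is an arbitrary choice; either works as long as the folded dimension stays within the grid limit.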