Hello,
I wrote a very simple kernel, and launching it fails with an “invalid configuration argument” error.
__global__ void SimpleKernel (int iLevel) {
    int id = threadIdx.x;
}

// Host code
int main (...)
{
    int iLevel = 0;
    // Grid of 4x4x4 blocks, 16 threads per block
    SimpleKernel <<< dim3(4,4,4), dim3(16,1,1) >>> (iLevel);
    // Print the error status of the launch
    printf("%s\n", cudaGetErrorString(cudaGetLastError()));
}
Why does my code cause an error? Using “dim3(4,4,1)” as the grid instead of “dim3(4,4,4)” works fine.
However, the CUDA manual states a block limit of 64 in the z-direction, not 1.
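One way to narrow this down is to check the grid dimensions and the block dimensions against their own limits, since the runtime reports them separately (cudaDeviceProp has maxGridSize, maxThreadsDim and maxThreadsPerBlock fields). Below is a minimal host-side sketch of such a check. LaunchLimits and launchConfigFits are hypothetical names introduced only for illustration; they are not part of the CUDA API, and the limit values have to be supplied by the caller, for example from cudaGetDeviceProperties.

```cpp
// Hypothetical helper struct (not part of the CUDA API): a plain stand-in for
// the cudaDeviceProp fields maxGridSize, maxThreadsDim and maxThreadsPerBlock.
struct LaunchLimits {
    int maxGridSize[3];      // max blocks per grid in x, y, z
    int maxThreadsDim[3];    // max threads per block in x, y, z
    int maxThreadsPerBlock;  // max total threads per block
};

// Hypothetical helper: returns true if every grid and block dimension is
// non-zero and within the given limits, and the total number of threads per
// block does not exceed maxThreadsPerBlock. It checks nothing else
// (for example, shared memory or register usage).
bool launchConfigFits(const unsigned grid[3], const unsigned block[3],
                      const LaunchLimits& lim)
{
    unsigned long long threadsPerBlock = 1;
    for (int i = 0; i < 3; ++i) {
        // Grid dimensions are checked against the grid limits ...
        if (grid[i] == 0 || grid[i] > static_cast<unsigned>(lim.maxGridSize[i]))
            return false;
        // ... and block dimensions against the separate block limits.
        if (block[i] == 0 || block[i] > static_cast<unsigned>(lim.maxThreadsDim[i]))
            return false;
        threadsPerBlock *= block[i];
    }
    return threadsPerBlock <= static_cast<unsigned long long>(lim.maxThreadsPerBlock);
}
```

As a usage example, assume a device that reports maxGridSize = {65535, 65535, 1} and maxThreadsDim = {512, 512, 64} (an assumption about the hardware, not something stated above). With those limits, the check rejects a grid of (4,4,4) with a block of (16,1,1) and accepts a grid of (4,4,1) with the same block, because the limit of 64 applies to the block's z-dimension and not to the grid's.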
Can anybody help me out?