Can the 3rd grid dimension not be greater than 1?

I have a triply nested loop in C++ that I want to run on the GPU, so I wrote this code:

void func()
{
    int i = 100, j = 100, k = 20;
    dim3 dimBlock(16, 16, 1);
    dim3 dimGrid(i/dimBlock.x + 1, j/dimBlock.y + 1, k/dimBlock.z + 1);
    mykernel<<<dimGrid, dimBlock>>>();
}

But I always get this error:

cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.2/cufft/src/execute.cu, line 1070

cufft: ERROR: CUFFT_EXEC_FAILED

cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.2/cufft/src/execute.cu, line 316

cufft: ERROR: CUFFT_EXEC_FAILED

cufft: ERROR: /root/cuda-stuff/sw/rel/gpgpu/toolkit/r2.2/cufft/src/cufft.cu, line 147

cufft: ERROR: CUFFT_EXEC_FAILED

I didn’t call cufft anywhere in the function func() above. Why am I getting a cufft error?

But when I set the 3rd component of dimGrid to 1, that is, dim3 dimGrid(i/dimBlock.x+1, j/dimBlock.y+1, 1), the error disappears!

Btw, I am using a GTX 295 and CUDA 2.2.

I also couldn’t find the folder or file shown in the error: cuda-stuff/sw/rel/gpgpu/toolkit/r2.2/cufft/src/execute.cu

Thank you.

Grids can be at most 2D; blocks can be 3D.

From the Programming Guide (section 4.2.3):

I see

Thank you.
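While grids are limited to 2D, one common workaround is to fold the z extent into the grid's y dimension and split it apart again inside the kernel. The following is only a minimal sketch, not code from this thread: ceilDiv, BlockYZ, foldedGridY, unfoldBlockY and checkFolding are hypothetical names, and the HD macro is there so the same helpers compile both as plain C++ and under nvcc.

```cpp
#include <cassert>

// Expands to CUDA qualifiers under nvcc, and to nothing in plain C++.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Number of blocks of size b needed to cover n elements.
HD inline unsigned ceilDiv(unsigned n, unsigned b) { return (n + b - 1) / b; }

// Logical block coordinates along y and z (hypothetical helper type).
struct BlockYZ { unsigned y; unsigned z; };

// Host side: y extent of the 2D grid that stands in for a (gx, gy, gz) grid.
HD inline unsigned foldedGridY(unsigned gy, unsigned gz) { return gy * gz; }

// Device side: split the physical blockIdx.y back into logical (y, z).
HD inline BlockYZ unfoldBlockY(unsigned physicalBlockY, unsigned gy) {
    BlockYZ b;
    b.y = physicalBlockY % gy;
    b.z = physicalBlockY / gy;
    return b;
}

// Host-only sanity check: each physical block maps to exactly one logical (y, z).
inline void checkFolding(unsigned gy, unsigned gz) {
    for (unsigned p = 0; p < foldedGridY(gy, gz); ++p) {
        BlockYZ b = unfoldBlockY(p, gy);
        assert(b.y < gy && b.z < gz);
        assert(b.z * gy + b.y == p);
    }
}
```

With the numbers from the question (i = j = 100, k = 20, 16x16x1 blocks), this gives gx = gy = ceilDiv(100, 16) = 7 and gz = 20, so the launch would use dimGrid(7, foldedGridY(7, 20), 1), that is dimGrid(7, 140, 1). Inside the kernel, unfoldBlockY(blockIdx.y, gy) recovers the logical indices, assuming gy is passed in as a kernel argument; threads whose computed index falls outside the 100x100x20 range still need to return early. The folded y extent must also stay within the per-dimension grid limit of the device.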

A supplemental question… is this limitation going to be relaxed in the future? There are times when being able to launch a 3D grid would be quite nice (and for the far future, I could express the kernel launch for my current project much more easily with a six dimensional grid :D )

I agree with you.

I also hope NVIDIA will add support for multi-dimensional grids to CUDA,

because in most physics problems we do the computation in a space that has more than three dimensions.

If NVIDIA wants to sell GPUs in the scientific field, I think this is really necessary.

I’m wondering why the documentation is misleading about this:

CUDA C Programming Guide Version 4.0, page 8:

3D grids are supported in CUDA 4.0 on Fermi GPUs.

I just ran into this issue as well. It seems that you need both a device with compute capability 2.0 and the CUDA 4.0 runtime to use the third dimension in grids. If I run on a Tesla C2050 (compute capability 2.0) with CUDA 4.0 it works, but on a GeForce GTX 460 (compute capability 2.1) with CUDA 3.2 it doesn’t. Maybe I missed something.

Correct, you need compute capability >= 2.0 and CUDA version >= 4.0 to use three-dimensional grids.
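The rule above can be written as a small predicate. This is a sketch under stated assumptions: supports3DGrid is a hypothetical helper, the compute capability major number is assumed to come from the major field filled in by cudaGetDeviceProperties, and the runtime version from cudaRuntimeGetVersion, which encodes a version as 1000 * major + 10 * minor (so CUDA 4.0 is 4000).

```cpp
// Returns true when a grid with a z extent greater than 1 can be launched,
// following the rule stated in this thread:
// compute capability >= 2.0 and CUDA runtime version >= 4.0.
//
// ccMajor:        compute capability major number (e.g. 2 for a Tesla C2050);
//                 assumed to be read from cudaDeviceProp::major.
// runtimeVersion: encoded as 1000 * major + 10 * minor (e.g. 3020 for CUDA 3.2);
//                 assumed to be read via cudaRuntimeGetVersion.
inline bool supports3DGrid(int ccMajor, int runtimeVersion) {
    const int kCuda40 = 4000;  // encoded value for CUDA 4.0
    return ccMajor >= 2 && runtimeVersion >= kCuda40;
}
```

For the cases mentioned in this thread: supports3DGrid(2, 4000) (Tesla C2050 with CUDA 4.0) is true, supports3DGrid(2, 3020) (GTX 460 with CUDA 3.2) is false, and supports3DGrid(1, 2020) (GTX 295, compute capability 1.3, with CUDA 2.2) is false. When the check fails, the grid's z extent has to stay at 1.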