I am new to CUDA, so I would really appreciate it if someone could help me with this.

I need to implement a 3D FFT in CUDA. The function below will be called by a Fortran program:

#include <cuda_runtime.h>
#include <cufft.h>

extern "C" void tempfft_(int *n1, int *n2, int *n3, cufftComplex *data)
{
    int Nx = *n1;
    int Ny = *n2;
    int Nz = *n3;

    // Allocate device memory for the data
    cufftComplex *d_data;
    cudaMalloc((void**) &d_data, sizeof(cufftComplex)*Nx*Ny*Nz);

    // Copy host memory to device
    cudaMemcpy(d_data, data, Nx*Ny*Nz*sizeof(cufftComplex), cudaMemcpyHostToDevice);

    // CUFFT plan
    cufftHandle plan;
    cufftPlan3d(&plan, Nx, Ny, Nz, CUFFT_C2C);

    // FFT execution (in place)
    cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);

    // Copy result to the host; the caller's array is overwritten with the transform
    cudaMemcpy(data, d_data, Nx*Ny*Nz*sizeof(cufftComplex), cudaMemcpyDeviceToHost);

    // Destroy the plan and clear device memory
    cufftDestroy(plan);
    cudaFree(d_data);
}


My questions are:

a) Is the above code sufficient to compute the 3D FFT, or are __global__ and __device__ functions needed?

b) There is a sample, simpleCUFFT, in the CUDA SDK. It defines a whole bunch of functions such as PadData, Convolve, etc. Are those functions needed in this program as well?

Please excuse my ignorance of CUDA.

Thanks a lot.

Your code seems OK, but you are not taking care of the different ordering between Fortran (column-major) and C (row-major). You will need to create the plan with the dimensions in reverse order:

cufftPlan3d(&plan, Nz, Ny, Nx, CUFFT_C2C);
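
To see why reversing the dimensions works, here is a small sketch in plain C (no GPU needed). It assumes the Fortran array is declared data(n1, n2, n3) and uses 0-based indices throughout. The helper names fortran_offset, c_offset and layouts_agree are made up for this illustration and are not part of CUDA or CUFFT. It checks that the memory offset of the Fortran element (i, j, k) is the same as the offset of the C element [k][j][i] in an array declared [n3][n2][n1], which is the layout cufftPlan3d expects when its first size argument is the slowest-varying dimension.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of the Fortran element (i, j, k), 0-based, in an array declared
   a(nx, ny, nz). Column-major: the FIRST index varies fastest in memory. */
size_t fortran_offset(size_t i, size_t j, size_t k, size_t nx, size_t ny)
{
    return i + nx * (j + ny * k);
}

/* Offset of the C element a[k][j][i] in an array declared a[nz][ny][nx].
   Row-major: the LAST index varies fastest in memory. */
size_t c_offset(size_t k, size_t j, size_t i, size_t ny, size_t nx)
{
    return (k * ny + j) * nx + i;
}

/* Returns 1 if the two formulas give the same offset for every element,
   i.e. a Fortran (nx, ny, nz) array and a C [nz][ny][nx] array share one
   memory layout; returns 0 otherwise. */
int layouts_agree(size_t nx, size_t ny, size_t nz)
{
    assert(nx > 0 && ny > 0 && nz > 0);
    for (size_t k = 0; k < nz; ++k)
        for (size_t j = 0; j < ny; ++j)
            for (size_t i = 0; i < nx; ++i)
                if (fortran_offset(i, j, k, nx, ny) != c_offset(k, j, i, ny, nx))
                    return 0;
    return 1;
}
```

Both formulas expand to i + nx*j + nx*ny*k, so layouts_agree returns 1 for any positive sizes. In other words, the Fortran caller does not need to transpose anything; only the order of the sizes passed to the plan has to be flipped.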

Thanks a lot mfatica.

Really appreciate it.