I am new to CUDA, so I would really appreciate it if someone could help me with this.
I need to implement a 3D FFT in CUDA. The function below will be called by a Fortran program:
#include <cuda_runtime.h>
#include <cufft.h>

extern "C" void tempfft_(int *n1, int *n2, int *n3, cufftComplex *data)
{
    // Fortran passes arguments by reference
    int Nx = *n1;
    int Ny = *n2;
    int Nz = *n3;

    // Allocate device memory for the data
    cufftComplex *d_data;
    cudaMalloc((void**)&d_data, sizeof(cufftComplex)*Nx*Ny*Nz);

    // Copy host memory to device
    cudaMemcpy(d_data, data, Nx*Ny*Nz*sizeof(cufftComplex), cudaMemcpyHostToDevice);

    // CUFFT plan
    cufftHandle plan;
    cufftPlan3d(&plan, Nx, Ny, Nz, CUFFT_C2C);

    // FFT execution (in place)
    cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);

    // Copy result to the host
    cudaMemcpy(data, d_data, Nx*Ny*Nz*sizeof(cufftComplex), cudaMemcpyDeviceToHost);

    // Destroy the plan and free device memory
    cufftDestroy(plan);
    cudaFree(d_data);

    // The result comes back through data; a void function returns no value
}
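A side note on array layout, since the data comes from Fortran: Fortran stores data(n1,n2,n3) with the first index varying fastest, while cufftPlan3d, as far as I understand the cuFFT documentation, treats its first size argument as the slowest-varying dimension and the last as the fastest (C row-major order). The small host-only C sketch below checks that a Fortran n1 x n2 x n3 array has the same memory layout as a C array with the dimensions reversed, which suggests the plan may need the sizes passed as (n3, n2, n1) when they differ. The helper names fortran_offset, c_offset and layouts_agree are made up for illustration and are not part of CUDA or cuFFT.

```c
#include <stddef.h>

/* Offset of Fortran element data(i,j,k), written 0-based here, in an
   n1 x n2 x n3 column-major array: the first index varies fastest. */
static size_t fortran_offset(int i, int j, int k, int n1, int n2)
{
    return (size_t)i + (size_t)n1 * ((size_t)j + (size_t)n2 * (size_t)k);
}

/* Offset of C element a[x][y][z] in an nx x ny x nz row-major array:
   the last index varies fastest. */
static size_t c_offset(int x, int y, int z, int ny, int nz)
{
    return (size_t)z + (size_t)nz * ((size_t)y + (size_t)ny * (size_t)x);
}

/* Returns 1 if every Fortran element (i,j,k) of an n1 x n2 x n3 array
   sits at the same offset as C element [k][j][i] of an n3 x n2 x n1
   array, and 0 otherwise. */
static int layouts_agree(int n1, int n2, int n3)
{
    for (int k = 0; k < n3; ++k)
        for (int j = 0; j < n2; ++j)
            for (int i = 0; i < n1; ++i)
                if (fortran_offset(i, j, k, n1, n2) !=
                    c_offset(k, j, i, n2, n1))
                    return 0;
    return 1;
}
```

Calling layouts_agree(4, 3, 2) returns 1, so the two views describe the same memory once the dimension order is reversed. For a cubic grid (n1 = n2 = n3) the order of the sizes makes no difference.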
My questions are:
a) Is the above code sufficient to compute the 3D FFT, or do I also need to write __global__ and __device__ functions?
b) There is a sample, simpleCUFFT, in the CUDA SDK. It defines a whole bunch of functions such as PadData, Convolve, etc. Are those functions needed in this program as well?
Please excuse my ignorance of CUDA.
Thanks a lot.