I want to use the FFTW Interface to cuFFT to run my Fourier transforms on GPUs. I don’t want to use cuFFT directly, because it does not seem to support 4-dimensional transforms at the moment, and I need those.
However, the documentation on the interface is not totally clear to me. My original FFTW program runs fine if I just switch to including cufftw.h rather than fftw3.h (so I’m not using FFTW directly), but does that mean that it still runs on the CPU, simply using the interface?
I am using the fftw_plan_dft() and fftw_execute() functions.
How do I ensure that the calculations are actually done on the GPU? I tried using cudaMallocManaged() to allocate memory for my data, but I’m not sure that means that the calculations are done on the GPU as well. An example using the FFTW interface to run something on a GPU would be most helpful.
For your peace of mind, don’t link against FFTW and remove anything related to it from your include paths in compilation.
Then it will be running on the GPU. These functions, fftw_plan_dft(), fftw_execute() et al., are just wrappers around cuFFT functions, which I highly recommend you use directly as soon as you can.
It is just a matter of:
- Allocate your arrays of type cufftReal and cufftComplex
- Create a cufftHandle for your forward and one for your backward transform (if needed)
- Create plans with cufftPlan1d or other type you need (check the cuFFT documentation)
- Run the various transforms available: cufftExecR2C, cufftExecC2R, cufftExecC2C…
- Destroy the plan with cufftDestroy when you are finished
- Deallocate the arrays
These are exactly the same steps you already have with FFTW, just with different API calls.
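The steps above can be sketched as follows for a single 1D complex-to-complex transform. This is a minimal illustration, not a tuned implementation: the length NX, the constant input signal, and the in-place forward transform are assumptions made for the example, and it is meant to be built with something like `nvcc sketch.cu -lcufft` on a machine with a CUDA-capable GPU.

```c
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>
#include <cufft.h>

#define NX 1024  /* illustrative transform length (assumption) */

int main(void)
{
    /* Host input: a constant signal, chosen only to have something to transform. */
    cufftComplex *h_data = (cufftComplex *)malloc(sizeof(cufftComplex) * NX);
    if (!h_data) return 1;
    for (int i = 0; i < NX; ++i) { h_data[i].x = 1.0f; h_data[i].y = 0.0f; }

    /* Allocate the device array and copy the input to the GPU. */
    cufftComplex *d_data = NULL;
    if (cudaMalloc((void **)&d_data, sizeof(cufftComplex) * NX) != cudaSuccess) return 1;
    if (cudaMemcpy(d_data, h_data, sizeof(cufftComplex) * NX,
                   cudaMemcpyHostToDevice) != cudaSuccess) return 1;

    /* Create the plan: one 1D complex-to-complex transform of length NX. */
    cufftHandle plan;
    if (cufftPlan1d(&plan, NX, CUFFT_C2C, 1) != CUFFT_SUCCESS) return 1;

    /* Run the forward transform in place on the device. */
    if (cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD) != CUFFT_SUCCESS) return 1;
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    /* Copy the spectrum back and look at one value. */
    if (cudaMemcpy(h_data, d_data, sizeof(cufftComplex) * NX,
                   cudaMemcpyDeviceToHost) != cudaSuccess) return 1;
    printf("bin 0 = (%f, %f)\n", h_data[0].x, h_data[0].y);

    /* Destroy the plan and deallocate the arrays. */
    cufftDestroy(plan);
    cudaFree(d_data);
    free(h_data);
    return 0;
}
```

Unlike the FFTW interface, the explicit cudaMalloc and cudaMemcpy calls here make it visible where the data lives, which is the main practical difference from the FFTW version of the same steps.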
Thanks for clearing it up. I no longer include FFTW, so then it must be using GPUs.
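For anyone looking for the kind of example asked for in the original post, here is a minimal sketch of a 4-dimensional transform through the FFTW interface. The 8x8x8x8 size and the constant input are assumptions made for illustration. It assumes a build along the lines of `nvcc fft4d.cu -lcufftw -lcufft`, with no FFTW headers or libraries on the include or link paths. This thread reports that rank 4 works through the interface; the NULL check on the plan is there in case a given cuFFT version rejects it.

```c
#include <stdio.h>
#include <stddef.h>
#include <cufftw.h>   /* replaces fftw3.h; do not include or link FFTW itself */

int main(void)
{
    /* Illustrative 4D problem size (assumption). */
    const int rank = 4;
    const int n[4] = {8, 8, 8, 8};
    const size_t total = (size_t)n[0] * n[1] * n[2] * n[3];

    fftw_complex *data = (fftw_complex *)fftw_malloc(sizeof(fftw_complex) * total);
    if (!data) return 1;

    /* Same planning call as in an ordinary FFTW program. */
    fftw_plan plan = fftw_plan_dft(rank, n, data, data, FFTW_FORWARD, FFTW_ESTIMATE);
    if (!plan) {
        fprintf(stderr, "fftw_plan_dft failed (rank %d may be unsupported)\n", rank);
        fftw_free(data);
        return 1;
    }

    /* Fill the input after planning, as is customary with FFTW. */
    for (size_t i = 0; i < total; ++i) { data[i][0] = 1.0; data[i][1] = 0.0; }

    /* Executed by the cuFFTW wrapper library rather than by FFTW. */
    fftw_execute(plan);
    printf("element 0 = (%f, %f)\n", data[0][0], data[0][1]);

    fftw_destroy_plan(plan);
    fftw_free(data);
    return 0;
}
```

One way to check that the work really lands on the GPU is to run the binary under a profiler such as `nsys profile ./fft4d` and look for cuFFT kernels and host-device copies in the report.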
“These functions, fftw_plan_dft(), fftw_execute() et al., are just wrappers around cuFFT functions, which I highly recommend you use directly as soon as you can.”
I would, but to my knowledge, cuFFT only does up to 3-dimensional arrays. From the doc: “cuFFT supports one-dimensional, two-dimensional and three-dimensional transforms”. There is no cufftPlan4d or cufftPlanNd, and I currently need 4-dimensional transforms, possibly higher in the future. I guess it would be possible to implement it as a series of 1D transforms or something like that, but that’s a lot of headache, and it seems to work readily via the FFTW interface. That actually raises the question: how is the interface implemented, so that a 4D transform works via the interface but is not directly available in cuFFT?
Anyway, an option to do higher dimensions directly with cuFFT would be appreciated.
You could file an enhancement request with NVIDIA for that. These can be filed using the normal bug reporting form on the developer website and prefixing the synopsis with “RFE:” to mark it as an enhancement request rather than a report for a functional bug.
And since it can take some time for it to be implemented (we don’t know their priorities), you may want to try Intel MKL, which runs on the CPU, supports the multidimensional FFTs you need, and in my experience is faster than FFTW.
I replaced all the FFTW code I had with MKL. And yes, MKL also provides wrappers for FFTW calls.