The CUFFT documentation states that “Only C2C and Z2Z transform types are supported” on multiple GPUs. (Also, only in-place transforms.) For heavy use of complex-to-real and real-to-complex transforms, one therefore has to choose between
1. Copying data as needed to a complex array and using CUFFT’s multi-GPU routines (and accounting for the permuted order of the results).
2. Writing one’s own routine which, e.g., slab-decomposes a 3D array across GPUs, manually uses CUFFT’s single-GPU 2D and 1D transforms, transposes the data, and performs the remaining dimension(s) of the transform (and possibly redistributes the data).
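For reference, the first option with CUFFT’s multi-GPU (cufftXt) API looks roughly like the following. This is an untested sketch with error checking omitted; the function name, GPU count, and the assumption that the real input has already been widened into the complex host array `h_signal` are mine. My understanding from the docs is that the in-place result is left in permuted order on the GPUs, and that copying back to the host restores natural order:

```cpp
#include <cufft.h>
#include <cufftXt.h>

// Sketch of option 1: in-place multi-GPU C2C on data that was copied
// from a real array into the real parts of a complex array.
void multi_gpu_c2c(cufftComplex *h_signal, int nx, int ny, int nz)
{
    cufftHandle plan;
    cufftCreate(&plan);

    int gpus[2] = {0, 1};               // GPUs to spread the transform over
    cufftXtSetGPUs(plan, 2, gpus);

    size_t work_sizes[2];               // one entry per GPU
    cufftMakePlan3d(plan, nx, ny, nz, CUFFT_C2C, work_sizes);

    cudaLibXtDesc *d_signal;            // descriptor for the distributed data
    cufftXtMalloc(plan, &d_signal, CUFFT_XT_FORMAT_INPLACE);
    cufftXtMemcpy(plan, d_signal, h_signal, CUFFT_COPY_HOST_TO_DEVICE);

    // Executes in place; result is left on the GPUs in permuted order
    cufftXtExecDescriptorC2C(plan, d_signal, d_signal, CUFFT_FORWARD);

    // Copying back to the host reorders the result to natural order
    cufftXtMemcpy(plan, h_signal, d_signal, CUFFT_COPY_DEVICE_TO_HOST);

    cufftXtFree(d_signal);
    cufftDestroy(plan);
}
```

The extra `cufftXtMemcpy` calls here are exactly the memory-copy cost I am worried about on top of the real-to-complex widening copy.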
The first option is much simpler to program, but I am wary of the cost of the extra memory copies. The second option likely mirrors what CUFFT itself does internally in the C2C case, just with R2C or C2R transforms substituted for the first stage, and I would imagine CUFFT’s implementation of those steps is faster than anything I could write.
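To make the comparison concrete, the per-GPU first stage of option 2 might look like the batched 2D R2C below. This is a sketch only; the function name and arguments are mine, and the inter-GPU transpose and final 1D C2C stage along z are not shown:

```cpp
#include <cufft.h>

// Sketch of the first stage of option 2: on one GPU, perform a batch of
// 2D R2C transforms over the local slab (nz_local planes of size ny x nx),
// out of place. A transpose across GPUs and 1D C2C transforms along z
// would follow (not shown).
void slab_stage1_r2c(cufftReal *d_real, cufftComplex *d_cplx,
                     int nx, int ny, int nz_local)
{
    cufftHandle plan;
    int n[2] = {ny, nx};                       // 2D transform size
    cufftPlanMany(&plan, 2, n,
                  NULL, 1, ny * nx,            // contiguous real input
                  NULL, 1, ny * (nx / 2 + 1),  // Hermitian-packed output
                  CUFFT_R2C, nz_local);        // one transform per plane
    cufftExecR2C(plan, d_real, d_cplx);
    cufftDestroy(plan);
}
```

Note the R2C output is Hermitian-packed (last dimension of length nx/2 + 1), which roughly halves the data that has to cross GPUs in the transpose, versus transforming a full complex array.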
My question is: is it worthwhile to attempt option 2) in order to mitigate the memory-copy cost (and relative inflexibility) of option 1)? Any other experience or insight would be greatly appreciated. (For concreteness, I am writing a code whose current single-GPU version spends ~50% of its runtime in FFTs. I also plan to eventually scale to an MPI (multi-node) implementation, with one or more GPUs per node.)