Question about cuFFT library

Hi,

For years I’ve been using cuFFT to speed up my signal processing application, and since I always did multiple contiguous 1D FFTs, cufftPlan1d fully met my needs.

Now I need to do something a bit trickier. My data is stored in a 3D matrix of size 512x512x16, and I need to perform:

  • 512x16 contiguous FFTs of size 512 in the first dimension => I can use cufftPlan1d like I always did for that

  • tricky part : 512x16x32 non-contiguous (stride 32) FFTs of size 16 along the second dimension but stored in a contiguous way (see below).

The cuFFT documentation describes cufftPlanMany, but some of its input parameters are far from crystal clear to me.

So my questions are :

Can I perform these two operations using a single 2D FFT plan? (I doubt it.)

At least, can I use cufftPlan1d and then cufftPlanMany to do this? If so, I suppose I won’t be able to batch all my FFTs across dimension 2 in one go, because the batch would run over both dimension 1 and dimension 3. Should I batch on dimension 1 and use streams for the loop on dimension 3?

Thanks.

Jeff

The cuFFT documentation describes cufftPlanMany, but some of its input parameters are far from crystal clear to me.

The inputs follow the same conventions as FFTW’s advanced interface; see “Advanced Complex DFTs” in the FFTW 3.3.10 manual.
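To make those parameters a bit more concrete, here is a minimal sketch of how a rank-1 advanced layout addresses memory, as I understand the cuFFT documentation: sample k of transform b is read at b * idist + k * istride. The 512x512x16 shape with dimension 1 fastest-varying is my assumption about the original post, and element_offset and linear_index are hypothetical helpers for illustration, not cuFFT functions.

```cpp
#include <cstddef>

// Assumed array shape; dimension 1 is taken to be the fastest-varying one.
constexpr std::size_t NX = 512, NY = 512, NZ = 16;

// Linear offset of element (x, y, z) under that assumed storage order.
constexpr std::size_t linear_index(std::size_t x, std::size_t y, std::size_t z) {
    return x + NX * (y + NY * z);
}

// Hypothetical helper mirroring the rank-1 advanced layout:
// sample k of transform b is read from b * idist + k * istride.
constexpr std::size_t element_offset(std::size_t b, std::size_t k,
                                     std::size_t istride, std::size_t idist) {
    return b * idist + k * istride;
}

// Transforms along dimension 1: consecutive samples are adjacent (istride = 1),
// consecutive transforms start NX elements apart (idist = NX), and a single
// batch of NY * NZ transforms covers the whole array.
constexpr std::size_t kStride1 = 1, kDist1 = NX, kBatch1 = NY * NZ;

// Transform b = y + NY * z is the row at (y, z); sample k is element x = k.
static_assert(element_offset(3 + NY * 2, 7, kStride1, kDist1) == linear_index(7, 3, 2),
              "dimension-1 layout should address element (7, 3, 2)");
```

With these values the layout is the plain contiguous batch you would already get from cufftPlan1d with nx = 512 and batch = 512 * 16, which is why the first step needs nothing new.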

Can I perform these two operations using a single 2D FFT plan? (I doubt it.)

No

IIUC, your sizes look consistent, so you should be able to create a plan with cufftPlanMany and launch chunks in multiple streams to get as much overlap as possible and saturate the GPU.
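As a rough sketch of what that chunking could look like for the strided dimension: a rank-1 plan has only one idist, so one option is to batch over dimension 1 inside a slab of fixed dimension-3 index and launch one chunk per slab, spread round-robin over the streams. The exact layout in the original post is not fully specified, so the shape, the storage order (dimension 1 fastest), the complex-to-complex type, and the names SlabPlan, SlabLaunch, dim2_slab_plan and slab_launch are all assumptions for illustration; substitute your real sizes and strides.

```cpp
#include <cstddef>

// Assumed array shape; dimension 1 is taken to be the fastest-varying one.
constexpr std::size_t NX = 512, NY = 512, NZ = 16;

// Plan parameters for transforms along dimension 2 inside one slab (fixed z):
// samples of one transform are NX elements apart, neighbouring transforms
// start 1 element apart, and there are NX transforms per slab.
struct SlabPlan { int n; int istride; int idist; int batch; };
constexpr SlabPlan dim2_slab_plan() {
    return { static_cast<int>(NY), static_cast<int>(NX), 1, static_cast<int>(NX) };
}

// Where slab z starts in memory and which stream it is assigned to.
struct SlabLaunch { std::size_t offset; int stream; };
constexpr SlabLaunch slab_launch(std::size_t z, int num_streams) {
    return { z * NX * NY, static_cast<int>(z % static_cast<std::size_t>(num_streams)) };
}

// Sample k (= y) of transform b (= x) in slab z lands on element (x, y, z).
static_assert(slab_launch(2, 4).offset + 7 * dim2_slab_plan().idist
                  + 3 * dim2_slab_plan().istride == 7 + NX * (3 + NY * 2),
              "slab layout should address element (7, 3, 2)");

// With cuFFT, each slab would then be launched roughly like this (shown as a
// comment only, not compiled here; p = dim2_slab_plan(), l = slab_launch(z, S)):
//   int n[1]     = { p.n };
//   int embed[1] = { p.n };  // must be non-NULL, otherwise the strides are ignored
//   cufftPlanMany(&plan, 1, n, embed, p.istride, p.idist,
//                 embed, p.istride, p.idist, CUFFT_C2C, p.batch);
//   cufftSetStream(plan, streams[l.stream]);
//   cufftExecC2C(plan, data + l.offset, data + l.offset, CUFFT_FORWARD);
```

It is probably safest to create one plan per stream rather than re-targeting a single plan while earlier work is still in flight, since each plan owns its own work area.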

In the future, it’s much easier to help if you can provide a small reproducer.

Thanks, I’ll share my code in case others might be interested.