Understanding cuFFT Data Layout


I’m currently attempting to perform a data rotation during an FFT, and I want to make sure I understand the parameters to cufftPlanMany(). In the past (especially for 1-D FFTs) I’ve used the simpler cufftPlan1d(), cufftPlan2d(), and cufftPlan3d() calls.

For a batched 1-D transform, cufftPlan1d() is effectively the same as calling cufftPlanMany() with idist=odist=transform_size and istride=ostride=1, correct? Say I want to do two transforms (let’s call them A and B) each of length N. This means that memory must be laid out as A0, A1, A2, … A(N-1), B0, B1, B2, … B(N-1), correct? That is, the expected memory layout is such that as I iterate over the buffer, samples from the first transform are next to one another followed by the second transform.
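To make the layout concrete, here is a minimal sketch in plain C of the addressing rule as I understand it for a 1-D batched transform: sample k of transform b sits at element offset b * dist + k * stride. The type and function names (layout1d, layout_offset, contiguous_layout) are made up for illustration and are not part of cuFFT; no GPU or cuFFT call is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: the two 1-D layout parameters, named after
   the stride/dist arguments of cufftPlanMany(). */
typedef struct {
    int stride; /* elements between consecutive samples of one transform */
    int dist;   /* elements between the first samples of consecutive transforms */
} layout1d;

/* Offset, in elements (not bytes), of sample k of transform b. */
size_t layout_offset(layout1d l, int b, int k)
{
    assert(b >= 0 && k >= 0);
    return (size_t)b * (size_t)l.dist + (size_t)k * (size_t)l.stride;
}

/* The layout described above for a batched 1-D plan of length n:
   stride = 1, dist = n, i.e. A0..A(n-1) followed by B0..B(n-1). */
layout1d contiguous_layout(int n)
{
    layout1d l = { 1, n };
    return l;
}
```

With n = 8, transform A (b = 0) occupies offsets 0 through 7 and transform B (b = 1) starts at offset 8, which matches the A0, A1, … B0, B1, … ordering described above.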

Now say I want to rotate the data in memory: I don’t want to change my input layout, but I do want to change my output layout. After the FFT, I want the batches to be interleaved. So, to borrow from my example, I want the output to be laid out as A0, B0, A1, B1, A2, B2, … A(N-1), B(N-1). Can I do this using cufftPlanMany()? Namely, I’m thinking that ostride would be equal to my batch size and odist would be 1.

Is that correct? Is there anything missing from my understanding?

Thanks in advance for the help.

I’m not saying I follow your data transformations.

If you can do what you want with FFTW, you can generally do it with CUFFT also.

Sorry, let me try to explain…

So normally, using a batched cufftPlan1d() with an FFT size of N, the data is laid out so that each transform’s N samples are grouped together next to one another in memory. Said differently, instead of passing in a batch size, I could loop batch-size times and execute a single FFT each time, advancing the pointer into my input buffer by N * sizeof(sample_type) bytes on each iteration of the loop.

What I want now instead is to have the resulting output data be grouped totally differently. I want batches to be right next to one another in memory. That is, the first index in my buffer should contain the first sample from batch 0, the second index (which would normally hold the second sample from batch 0) needs to now have the first sample from batch 1. I think I can accomplish this with odist = 1 and ostride = batch size, but I wanted to make sure I am understanding these parameters correctly.

Yes, you can interleave the output data that way using the advanced data layout parameters.
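As a sanity check on the parameters, here is a small sketch in plain C that only moves data between the two layouts (no FFT, no GPU), so the resulting order is easy to inspect. The helper names offset_1d and relayout are hypothetical, not cuFFT functions. The comment at the end shows roughly what the matching cufftPlanMany() call would look like, assuming a complex-to-complex, out-of-place transform; note that inembed and onembed need to be non-NULL there, since cuFFT ignores the stride and dist arguments when they are NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Offset, in elements, of sample k of transform b: b * dist + k * stride. */
size_t offset_1d(int b, int k, int stride, int dist)
{
    assert(b >= 0 && k >= 0);
    return (size_t)b * (size_t)dist + (size_t)k * (size_t)stride;
}

/* Copies `batch` sequences of length n from the input layout to the
   output layout. A real plan would also transform each sequence; this
   only rearranges it. */
void relayout(const int *in, int *out, int n, int batch,
              int istride, int idist, int ostride, int odist)
{
    for (int b = 0; b < batch; ++b)
        for (int k = 0; k < n; ++k)
            out[offset_1d(b, k, ostride, odist)] =
                in[offset_1d(b, k, istride, idist)];
}

/* Roughly the corresponding plan (not compiled here; N and batch are
 * the transform length and batch count from the question):
 *
 *   int n[1] = { N };
 *   cufftPlanMany(&plan, 1, n,
 *                 n, 1, N,        // inembed, istride, idist
 *                 n, batch, 1,    // onembed, ostride, odist
 *                 CUFFT_C2C, batch);
 */
```

Calling relayout(in, out, 3, 2, 1, 3, 2, 1) on the input {A0, A1, A2, B0, B1, B2} fills out as {A0, B0, A1, B1, A2, B2}, which is the interleaved order asked about.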