Using cuFFTDx to calculate FFTs on matrix lines


I want to execute an FFT on every line (row) of an M×N matrix using the cuFFTDx library, but I’m not sure how to implement it.
Would the following approach work?

  1. Define the description of one-line-FFT using the “Description Operators” and use the “Block()” operator.

  2. Define “FFTs Per Block” to be M (the number of lines)

  3. Get the recommended parameters of “elements_per_thread”, “shared_memory_size” and so on.

  4. Use those parameters to execute the FFT M times in each thread (so each thread calculates a few elements of each line). I’m not sure how to implement this stage at all.

Can someone help?

As long as the FFT size N is within the limits cuFFTDx supports, you can use batching.
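To make the batching idea concrete, here is a minimal host-side C++ sketch of the index arithmetic only; it is not cuFFTDx code. It assumes the matrix is stored row-major (row r starts at offset r * N), that N is divisible by the elements-per-thread value, and that the threads working on one FFT read elements a fixed stride apart. The helper names blocks_needed and thread_element_indices are hypothetical and only illustrate how a (block, FFT-in-block, thread) triple could map onto matrix elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; they are not part of cuFFTDx.

// Blocks needed to cover `rows` FFTs when each block computes `ffts_per_block`.
inline std::size_t blocks_needed(std::size_t rows, std::size_t ffts_per_block) {
    assert(ffts_per_block > 0);
    return (rows + ffts_per_block - 1) / ffts_per_block;  // ceiling division
}

// Flat indices (into the row-major rows x n matrix) that one thread would load.
// Assumption: threads of one FFT are strided by n / elements_per_thread.
inline std::vector<std::size_t> thread_element_indices(
    std::size_t block, std::size_t local_fft, std::size_t thread,
    std::size_t rows, std::size_t n,
    std::size_t ffts_per_block, std::size_t elements_per_thread) {
    assert(elements_per_thread > 0 && n % elements_per_thread == 0);
    std::vector<std::size_t> indices;

    // One FFT per row: the global FFT id is also the row index.
    const std::size_t row = block * ffts_per_block + local_fft;
    if (row >= rows) {
        return indices;  // unused slot in the last block: nothing to load
    }

    const std::size_t stride = n / elements_per_thread;  // threads per FFT
    assert(thread < stride);
    for (std::size_t i = 0; i < elements_per_thread; ++i) {
        indices.push_back(row * n + thread + i * stride);
    }
    return indices;
}
```

For example, with M = 5, N = 16, two FFTs per block and four elements per thread, this gives three blocks, and the second FFT slot of the last block maps to no row, so a real kernel would need a similar bounds check.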
You’ll have one FFT per row. Start with FFTs_Per_Block set to one; once you get the correct answer, increase it and compare performance.
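One way to check for "the correct answer" is to compare the GPU output against a slow reference computed on the host. The sketch below is an assumption about how such a check could look, not something taken from cuFFTDx or the linked examples: reference_row_dft is a naive O(N^2) forward DFT applied to each row of a row-major matrix (unnormalized, negative exponent), and max_abs_error reports the largest element-wise difference between two result buffers. Both names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Naive forward DFT of every row of a row-major rows x n matrix.
// Accumulates in double to keep the reference itself accurate.
inline std::vector<cfloat> reference_row_dft(const std::vector<cfloat>& in,
                                             std::size_t rows, std::size_t n) {
    const double two_pi = 6.283185307179586;
    std::vector<cfloat> out(rows * n);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t k = 0; k < n; ++k) {
            double re = 0.0;
            double im = 0.0;
            for (std::size_t j = 0; j < n; ++j) {
                // Reduce k*j modulo n so the angle stays small and precise.
                const double angle =
                    -two_pi * static_cast<double>((k * j) % n) / static_cast<double>(n);
                const double xr = in[r * n + j].real();
                const double xi = in[r * n + j].imag();
                re += xr * std::cos(angle) - xi * std::sin(angle);
                im += xr * std::sin(angle) + xi * std::cos(angle);
            }
            out[r * n + k] = cfloat(static_cast<float>(re), static_cast<float>(im));
        }
    }
    return out;
}

// Largest element-wise absolute difference over the common length of a and b.
inline float max_abs_error(const std::vector<cfloat>& a, const std::vector<cfloat>& b) {
    float worst = 0.0f;
    const std::size_t count = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < count; ++i) {
        worst = std::max(worst, std::abs(a[i] - b[i]));
    }
    return worst;
}
```

After copying the GPU result back to the host, calling max_abs_error on it and the reference gives a single number to watch while you change FFTs_Per_Block; what tolerance is acceptable depends on N and the precision you use.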
Check out these examples: GitHub - mnicely/cufft_examples: cuFFT and cuFFTDx example