Hi, I’m experimenting with implementing some basic DSP filtering in CUDA. I finished my 1D direct FFT filter and am now trying to filter a 2D matrix row by row, but faster than processing the rows sequentially as separate 1D arrays.
From what I have read, this is usually done with cufftPlanMany rather than cufftPlan1d with batches, but I’m struggling to figure out how to properly set the length of my FFT.
As I’m doing DSP filtering, I want to take the FFT of my impulse response (filter) and of my signal, with the FFT length chosen as the next greater power of 2 after (signalLength+irLength-1). In 1D, cufftPlan1d let me set the size of the FFT with the ‘nx’ argument:
cufftPlan1d(&plan, fftLength, CUFFT_R2C, 1);
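For reference, the FFT length described above can be computed on the host roughly as follows. This is only a minimal sketch: the helper name fft_length_for is hypothetical, and it assumes that "next power of 2" means the smallest power of two that is not less than signalLength+irLength-1 (so an exact power of two is kept as-is).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest power of two that is >= the linear
   convolution length (signalLength + irLength - 1). */
size_t fft_length_for(size_t signalLength, size_t irLength)
{
    /* Both inputs must be non-empty for the convolution length to make sense. */
    assert(signalLength > 0 && irLength > 0);

    size_t needed = signalLength + irLength - 1;
    size_t n = 1;
    while (n < needed) {
        n <<= 1; /* double until the convolution fits */
    }
    return n;
}
```

The result would then be passed as fftLength (the ‘nx’ argument) in the call above.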
But now that my 2D signal matrix has size signalLength*rows, where can I tell cuFFT to pad each input row so that it reaches my chosen FFT length?
Because if I use:
cufftPlanMany(&plan, 1, {fftLength}, inembed, istride, idist, onembed, ostride, odist, CUFFT_R2C, rows)
how will it know the length of each signal (row)? I’m guessing I need to set the ‘idist’ argument for that, but then I’m struggling to figure out what all the other arguments should be. My best guesses would be:
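int rank = 1;                    // 1D transforms
int n[] = {fftLength};           // length of each transform
int inembed[] = {0};             // input storage dimensions (guess)
int istride = 1;                 // consecutive input samples are contiguous
int idist = inputLength;         // distance between the starts of consecutive input rows
int onembed[] = {0};             // output storage dimensions (guess)
int ostride = 1;                 // consecutive output bins are contiguous
int odist = (fftLength/2) + 1;   // complex output bins per row for R2C
int batch = rows;                // one transform per row
cufftPlanMany(&forwardPlanInput, rank, n, inembed, istride, idist, onembed, ostride, odist, CUFFT_R2C, batch);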
But I very much doubt this is correct, as the results are only right for the first row.
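For context, my understanding of the cuFFT advanced data layout is that a 1D batched transform reads sample x of batch b from input[b*idist + x*istride] and always reads n[0] samples per batch. It does not zero-pad, so with idist smaller than fftLength each transform would run on into the following rows. A minimal host-side sketch of padding the rows explicitly before the transform could look like this. The names pad_rows and batch_offset are hypothetical, and it assumes the signal is stored row-major with signalLength floats per row.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `rows` rows of `signalLength` floats into a
   buffer whose rows are `fftLength` floats long, zero-filling the tail of
   each row. dst must hold rows * fftLength floats. */
void pad_rows(const float *src, size_t rows, size_t signalLength,
              size_t fftLength, float *dst)
{
    assert(src != NULL && dst != NULL);
    assert(fftLength >= signalLength);

    /* All-zero bytes represent 0.0f for IEEE 754 floats. */
    memset(dst, 0, rows * fftLength * sizeof(float));
    for (size_t r = 0; r < rows; ++r) {
        memcpy(dst + r * fftLength, src + r * signalLength,
               signalLength * sizeof(float));
    }
}

/* Offset of sample x of batch b under the rank-1 advanced layout. */
size_t batch_offset(size_t b, size_t x, size_t stride, size_t dist)
{
    return b * dist + x * stride;
}
```

With a buffer padded like this, the plan would presumably use idist = fftLength and odist = (fftLength/2) + 1. On the device, a cudaMemset of the padded buffer followed by a cudaMemcpy2D with a destination pitch of fftLength*sizeof(float) and a source pitch of signalLength*sizeof(float) should give the same layout without a host-side loop, though I have not verified that here.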