Differences between cuFFT cufftPlan2d and MATLAB fft2?

Hi.

I’m having some problems writing a CUDA implementation of fft2 for MATLAB. According to the MATLAB docs, when m and n are passed along with a matrix, the matrix is zero-padded or truncated to m-by-n before the fft2 is computed. My code pads/truncates the matrix correctly, but after running the 2D FFT only the first element is right; all the other elements of the matrix are wrong. Can anyone tell me why this happens? After truncating/padding the matrix, all I do is:

cufftHandle plan;

cufftPlan2d(&plan, M, N, CUFFT_C2C);

cufftExecC2C(plan, dmatrix, dmatrix, CUFFT_FORWARD);

cufftDestroy(plan);

but this gives the wrong result. If I skip the truncation/padding, however, I get the right result. Is there any difference between how MATLAB and cuFFT compute 2D FFTs? I’m really confused now.

EDIT: After some more testing, I see that this happens only if M != N. Does MATLAB's fft2 transpose the matrix before doing the 2D FFT, so that I have to do the same in my code?

Could it be because MATLAB stores data in column-major order, while cuFFT expects the data in row-major order?

This helped. When I read a little closer, I even saw that it is stated in the cuFFT documentation. Stupid me.