Matlab FFT vs CUDA FFT

Hello,

There are some posts about discrepancies between FFTs computed with Matlab and with CUDA that I found interesting:

https://devtalk.nvidia.com/default/topic/476089/cuda-programming-and-performance/cuda-fft-different-from-matlab-fft/

https://devtalk.nvidia.com/default/topic/472061/cuda-fft-vs-matlab-fft-fft/

I have had similar issues comparing results from the two, although I have found a solution that does not make much sense to me.

I understand that problems can arise from single vs double precision and from column-major vs row-major storage.

When comparing square NxN matrices I see fairly similar results. With NxM matrices where N != M I see very different results until I change

cufftPlan2d(&plan,width,height,CUFFT_Z2Z)

to

cufftPlan2d(&plan,height,width,CUFFT_Z2Z)

which is odd, because the cuFFT documentation gives the signature as

cufftPlan2d(cufftHandle *plan, int nx, int ny, cufftType type);

with the x-dimension before the y-dimension.

I believe I am creating my flattened 2D array from an OpenCV image correctly and displaying the results in the row-major format with

for (int i = 0; i < height; i++)
{
    for (int j = 0; j < width; j++)
    {
        array[width*i+j] = image.at<float>(i,j);
    }
}

for (int i = 0; i < 16; i++)
{
    for (int j = 0; j < 16; j++)
    {
        std::cout << array[width*i+j] << " ";
    }
    std::cout << " " << std::endl;
}

so I do not understand why these discrepancies occur. If anyone has insight into why this happens, I’d like to know. The suggestions in the earlier posts did not seem to resolve much.
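
For anyone who wants to check which side disagrees, a small CPU reference may help. The following is only a sketch, not cuFFT or Matlab code: a naive (and very slow) 2D DFT over a row-major buffer. The names naiveDft2d, rows and cols are made up for illustration. It assumes the unnormalized forward transform with a negative exponent, and that element (r, c) is stored at r * cols + c, the same layout as the fill loop above.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive 2D DFT of a row-major buffer (illustrative helper, not a library call).
// Assumption: element (r, c) lives at in[r * cols + c], so cols is the
// contiguous (fastest-varying) dimension and rows the slowest-varying one.
std::vector<std::complex<double>> naiveDft2d(
    const std::vector<std::complex<double>>& in, int rows, int cols)
{
    assert(in.size() == static_cast<std::size_t>(rows) * cols);
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<std::complex<double>> out(in.size());

    for (int u = 0; u < rows; ++u) {          // output row frequency
        for (int v = 0; v < cols; ++v) {      // output column frequency
            std::complex<double> sum(0.0, 0.0);
            for (int r = 0; r < rows; ++r) {
                for (int c = 0; c < cols; ++c) {
                    // Unnormalized forward transform, negative exponent.
                    const double angle = -twoPi *
                        (static_cast<double>(u) * r / rows +
                         static_cast<double>(v) * c / cols);
                    sum += in[r * cols + c] * std::polar(1.0, angle);
                }
            }
            out[u * cols + v] = sum;
        }
    }
    return out;
}
```

Running this on a small non-square test image (say 4x6) and comparing it element by element against both the GPU output and the Matlab output should show whether a mismatch comes from the layout or from the transform itself.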

This may (or may not) be of interest:

https://stackoverflow.com/questions/44436155/cuda-fft-2d-different-results-from-matlab-fft-on-2d
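
One possible reading of the swap, sketched under an assumption taken from the cuFFT documentation: nx is the slowest-varying dimension of a row-major buffer and ny is the contiguous one, so "x" there does not mean image width. The names PlanDims, planDimsForRowMajor and flatIndex below are hypothetical helpers, not part of cuFFT.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the (nx, ny) pair passed to cufftPlan2d.
struct PlanDims {
    int nx;  // ASSUMED slowest-varying dimension (strided in memory)
    int ny;  // ASSUMED fastest-varying dimension (contiguous in memory)
};

// For a row-major image, rows are the slow dimension and columns the fast one,
// so under the assumption above the plan would take (height, width).
PlanDims planDimsForRowMajor(int height, int width)
{
    return PlanDims{height, width};
}

// Flat offset of element (x, y) in an nx-by-ny row-major buffer.
std::size_t flatIndex(PlanDims d, int x, int y)
{
    assert(x >= 0 && x < d.nx && y >= 0 && y < d.ny);
    return static_cast<std::size_t>(x) * d.ny + y;
}
```

With these dimensions, flatIndex(d, i, j) equals width*i+j, the same indexing used when filling the array from the OpenCV image, which would make cufftPlan2d(&plan,height,width,CUFFT_Z2Z) the consistent call. For square inputs both orderings give the same plan size, which might be why the NxN cases looked fine.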