I have an accuracy problem when comparing CUDA FFT results with FFTW3F.
The GPU is an RTX 3080, with CUDA and NVCC version 11.1.
I create an Eigen::Matrix with 2048 rows and 2048 columns. The difference between the CUDA and FFTW3F results is larger than 1e-3,
whereas I expect it to be less than 1e-5.
real max coeff: 0.00195312
real min coeff: -0.00183105
imag max coeff: 0.00195312
imag min coeff: -0.00195312
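
For illustration, a comparison of this kind could be sketched as below. This is not the attached test code: `DiffStats` and `compare_results` are hypothetical names, and plain `std::vector` buffers stand in for the Eigen matrices. Besides the max/min coefficients of the element-wise difference, the sketch also reports the largest difference relative to the largest reference magnitude. That is an assumption about what a useful check might look like, since an absolute threshold such as 1e-5 does not account for how large the FFT outputs are.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the attached archive): summarizes the
// element-wise difference between two single-precision complex results.
struct DiffStats {
    float real_max = 0.f;     // largest real part of (test - ref)
    float real_min = 0.f;     // smallest real part of (test - ref)
    float imag_max = 0.f;     // largest imaginary part of (test - ref)
    float imag_min = 0.f;     // smallest imaginary part of (test - ref)
    float max_abs_ref = 0.f;  // largest magnitude found in the reference
    float max_rel = 0.f;      // largest |test - ref| divided by max_abs_ref
};

DiffStats compare_results(const std::vector<std::complex<float>>& test,
                          const std::vector<std::complex<float>>& ref) {
    assert(test.size() == ref.size());
    DiffStats s;
    if (ref.empty()) return s;

    // Seed the extremes with the first difference so that all-positive or
    // all-negative differences are reported correctly.
    const std::complex<float> d0 = test[0] - ref[0];
    s.real_max = s.real_min = d0.real();
    s.imag_max = s.imag_min = d0.imag();

    float max_abs_diff = 0.f;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const std::complex<float> d = test[i] - ref[i];
        s.real_max = std::max(s.real_max, d.real());
        s.real_min = std::min(s.real_min, d.real());
        s.imag_max = std::max(s.imag_max, d.imag());
        s.imag_min = std::min(s.imag_min, d.imag());
        max_abs_diff = std::max(max_abs_diff, std::abs(d));
        s.max_abs_ref = std::max(s.max_abs_ref, std::abs(ref[i]));
    }
    // Normalize by the reference scale; skipped for an all-zero reference.
    if (s.max_abs_ref > 0.f) s.max_rel = max_abs_diff / s.max_abs_ref;
    return s;
}
```

Calling `compare_results(cuda_out, fftw_out)` on the two flattened 2048x2048 outputs (both buffer names are placeholders) would give the four coefficients plus a scale-aware figure. As a side observation, 0.00195312 is 2^-9 to the printed precision, which is the spacing between adjacent single-precision values in the range [16384, 32768). If the FFT outputs are of that magnitude, a difference of this size may correspond to about one unit in the last place.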
The test code is attached: fft_cmp.tar.gz (3.3 MB)