I’ve been playing around with CUDA 2.2 for the last week and, as practice, started replacing Matlab functions (interp2, interpft) with CUDA MEX files. When I first noticed that Matlab’s FFT results were different from CUFFT, I chalked it up to the single vs. double precision issue. However, the differences seemed too great so I downloaded the latest FFTW library and did some comparisons.
Using a complex input vector of 512 records:
- I took the absolute difference between each library’s FFT result and Matlab’s, and plotted it for FFTW-DP, FFTW-SP, and CUFFT
- I did the FFT followed by the IFFT (with appropriate scaling) and compared to the original data.
Both plots are attached to this post.
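For anyone who wants to reproduce the comparison, here is a rough NumPy sketch of the methodology above. Caveat: NumPy's FFT computes internally in double precision, so casting to `complex64` here only models rounding at the input/output interfaces, not the accumulated single-precision arithmetic inside CUFFT or FFTW-SP; the random test vector is also my own stand-in for the original data.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 512
# Arbitrary complex test vector of 512 samples (stand-in for the original data).
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Double-precision FFT as the reference (stand-in for Matlab / FFTW-DP).
ref = np.fft.fft(x)

# Crude single-precision model: quantize input and output to complex64.
# (This captures rounding at the interfaces only, since NumPy's FFT
# itself runs in double precision.)
x_sp = x.astype(np.complex64)
sp = np.fft.fft(x_sp).astype(np.complex64)

# Forward-transform error against the double-precision reference.
fwd_err = np.abs(sp.astype(np.complex128) - ref)

# Round-trip test: FFT followed by IFFT (NumPy's ifft already applies
# the 1/n scaling), compared against the original data.
rt = np.fft.ifft(sp).astype(np.complex64)
rt_err = np.abs(rt.astype(np.complex128) - x)

print(f"mean forward error:    {fwd_err.mean():.3e}")
print(f"mean round-trip error: {rt_err.mean():.3e}")
```

Swapping the `complex64` casts for a true single-precision FFT (e.g. FFTW-SP or CUFFT via a MEX file) is what produces the plotted curves.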
I found that the CUFFT results were quite a bit “worse” than the FFTW-SP results… by a factor of 4 to 5 on average across the 512 samples. Note that when I input a vector of 600 complex samples, the CUFFT results were about 8 times worse than FFTW-SP (presumably because 600 is not a product of small primes like 2, 3, and 5).
I guess my question is: is this level of error expected between two single-precision implementations? I am looking to use CUFFT in an application that is sensitive to signal phase and was hoping for better accuracy.
I’d appreciate any comments that could help me better understand why this difference exists. Thanks!