Accuracy of CUFFT (regarding Fermi)


For my current project I need to do a lot of FFTs followed by some other calculations. With the CUDA 2.3 FFT library and my compute capability 1.3 GTX 285, I can't get the necessary accuracy when doing the FFTs. Compared with Matlab results, my CUDA results are always off by +/- 0.00x. As a result, the final output of my algorithm (after several FFTs and multiplications of the results of different FFTs) is wrong. I have narrowed the problem down to the inaccuracy of CUFFT, and it doesn't get better when using double precision.
As far as I have read, this is due to the hardware implementation of sin and cos on GPUs. So now I'm wondering whether this has improved on Fermi GPUs, as I'm planning on getting a GTX 480 for this project. Has anyone tested the CUFFT accuracy on Fermi yet? I need to achieve something very close to Matlab. If not, maybe someone can point me to another CUDA FFT library I could use instead (I need to do a lot of 1D C2C/Z2Z FFTs of size 2048 in parallel, like using batch in CUFFT)…

Thx in advance!

I’ve also tried CUFFT on a GTX 285. In my case the relative error is satisfactory, very close to machine precision eps (for Z2Z, it is 1e-16). If the error grows a lot over the course of your computation, it may be because your algorithm is not numerically stable, or because the problem is ill-conditioned.
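
To make "relative error close to eps" concrete, here is a rough CPU-side sketch in plain Python. It does not call CUFFT. It runs a textbook radix-2 FFT of size 2048 once in double precision and once with every intermediate value rounded to float32, then reports the relative L2 error. The rounding trick is only an approximation of real single-precision arithmetic, and all helper names (`to_f32`, `round_c`, `fft`, `ifft`, `rel_err`) are my own, not part of any library. The same `rel_err` measure could be applied to output copied back from the GPU and a reference exported from Matlab.

```python
import cmath
import math
import random
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def round_c(z, single):
    """Optionally round both parts of a complex number to float32."""
    if not single:
        return z
    return complex(to_f32(z.real), to_f32(z.imag))


def fft(x, single=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.

    With single=True every twiddle factor and intermediate result is
    rounded to float32, which roughly emulates single precision.
    """
    n = len(x)
    if n == 1:
        return [round_c(x[0], single)]
    even = fft(x[0::2], single)
    odd = fft(x[1::2], single)
    out = [0j] * n
    for k in range(n // 2):
        w = round_c(cmath.exp(-2j * math.pi * k / n), single)
        t = round_c(w * odd[k], single)
        out[k] = round_c(even[k] + t, single)
        out[k + n // 2] = round_c(even[k] - t, single)
    return out


def ifft(X):
    """Inverse FFT in double precision via the conjugation identity."""
    n = len(X)
    return [v.conjugate() / n for v in fft([u.conjugate() for u in X])]


def rel_err(a, ref):
    """Relative L2 error ||a - ref|| / ||ref||."""
    num = math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(a, ref)))
    den = math.sqrt(sum(abs(q) ** 2 for q in ref))
    return num / den


random.seed(0)  # fixed seed so repeated runs use the same input
N = 2048
x = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(N)]

# Emulated single precision, measured against the double-precision FFT.
err_single = rel_err(fft(x, single=True), fft(x))

# Double precision, measured by a forward/inverse round trip.
err_double = rel_err(ifft(fft(x)), x)

print("emulated single precision, relative error:", err_single)
print("double precision round trip, relative error:", err_double)
```

If the error you measure this way for a single transform is already near eps for the precision you are using, then the FFT itself is probably not the culprit, and the growth is more likely happening in the later multiplications and repeated transforms.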