Before upgrading from CUDA 2.2 to 2.3, I wrote a small FFT benchmark to see how the new release performs. I did not expect much difference, but the newer CUDA version turned out to be considerably faster, especially for larger FFT sizes, where the gain is roughly a factor of three. Can anybody else confirm this behavior? Does the new FFT library use more sophisticated algorithms? What accounts for such a large performance boost?
Results are documented here.