GTX280 versus 8800GTX running CUFFT The 8800GTX seems to compute FFTs faster than the GTX280. A fluk

I am benchmarking the CUFFT code before incorporating it into a bigger project. My results show that an 8800GTX on a Mac actually outperforms a GTX280 on Linux. I have two GTX280s and get comparable results on both. My results are also in line with those of Naga Govindaraju, including the dip at log_2 N = 9. (Unfortunately he is not willing to release his code. Yes, I did get V. Volkov's.)

Can anybody explain why the 8800GTX is better than the GTX280 for larger FFTs? Memory bandwidth?
I was surprised.

Here are my results:

Which CUDA version?

AFAIK, there is no 8800GTX for Mac.

Oops, you are correct: it is the 8800GT for the Mac. That X just crept in (rather consistently, I guess).

The Macs have the 2.3 release, the Linux machine has 2.2. Should I look into upgrading the Linux system to 2.3?

I think there were significant CUFFT improvements in 2.3, yes.
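
For anyone repeating this comparison after upgrading, it may help to put both machines' timings on the same scale. Below is a minimal sketch in Python, not taken from this thread: it converts a measured time into nominal GFLOP/s using the 5 N log2 N operation count that FFT benchmarks such as FFTW's benchFFT conventionally use for complex transforms. The function name `fft_gflops` and its parameters are made up for illustration, and the figure is a nominal rate, not a count of the operations CUFFT actually executes.

```python
import math


def fft_gflops(n, batch, elapsed_ms):
    """Nominal GFLOP/s for `batch` complex 1D FFTs of length `n`.

    Uses the conventional 5 * N * log2(N) operation count per transform.
    `elapsed_ms` is the measured wall time for the whole batch, in ms.
    """
    if n < 2 or batch < 1 or elapsed_ms <= 0:
        raise ValueError("need n >= 2, batch >= 1 and elapsed_ms > 0")
    nominal_flops = 5.0 * n * math.log2(n) * batch
    seconds = elapsed_ms * 1e-3
    return nominal_flops / seconds / 1e9


if __name__ == "__main__":
    # Placeholder timings for illustration only, NOT measurements.
    for log2_n, ms in [(8, 1.0), (9, 2.0), (10, 4.0)]:
        n = 2 ** log2_n
        print(f"log2 N = {log2_n}: {fft_gflops(n, 1024, ms):.2f} GFLOP/s")
```

Running the same sizes and batch counts on both cards, with the same CUDA release on each, should show whether the gap comes from the hardware or from the CUFFT version.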