I installed CUDA 1.1 under Fedora Core 8 x86_64 and have been benchmarking my own program, along with some of the SDK samples, against Windows on the exact same hardware. First, my program:
Windows XP (32-bit): 1450.8ms
Linux (FC8, x86_64): 2554.2ms
So, how about the stock BlackScholes SDK example?
Windows XP (32-bit): 3.5ms
Linux (FC8, x86_64): 5.3ms
Linux is running at roughly 57%-66% of Windows speed with the 1.1 SDK!
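For anyone who wants to check the arithmetic, here is a quick sketch of how the relative speed falls out of the timings above. The helper name `relative_speed` is just something made up for illustration, and it assumes "speed" means the inverse of elapsed time for the same workload.

```python
# Timings copied from the 1.1 results above, in milliseconds.
timings_ms = {
    "my program": {"windows": 1450.8, "linux": 2554.2},
    "BlackScholes": {"windows": 3.5, "linux": 5.3},
}


def relative_speed(windows_ms, linux_ms):
    """Return Linux speed as a fraction of Windows speed.

    Assumes both runs do identical work, so speed is proportional
    to 1 / elapsed time and the ratio is windows_ms / linux_ms.
    """
    return windows_ms / linux_ms


for name, t in timings_ms.items():
    pct = 100.0 * relative_speed(t["windows"], t["linux"])
    print(f"{name}: Linux runs at {pct:.1f}% of Windows speed")
```

Swapping in the 2.0 timings gives the same kind of comparison for the newer release.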
So, I installed 2.0 beta under Linux.
With the 2.0 beta, my Linux results now exactly match my Windows 1.1 results, which is a big performance jump over Linux with 1.1.
So, I installed 2.0 under Windows.
My program went from 1450.8ms to 1170.4ms under Windows, about 19% less time. So my Linux 2.0 benchmarks now match my Windows 1.1 benchmarks, but Windows 2.0 is ahead again: for my program, Linux 2.0 takes roughly 24% longer than Windows 2.0.
Bottom line, the 2.0 beta is definitely worth installing immediately, but the performance mismatch between Linux and Windows is puzzling.