I’ve tested several apps in the GPU Computing SDK, such as GrabCutNPP, radixSort, etc. Surprisingly, I found the GTX 680 is actually slower than my old GTX 480 (about 0.9x). Why would this happen? In contrast, 3DMark11 reports the GTX 680 as 2x faster.
The installed driver is 301.10, with CUDA Toolkit 4.2. My OS is Windows 7 SP1. I even compiled the code with compute_30 and sm_30, but the results stayed the same.
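For reference, this is the kind of build invocation I mean; it is only a sketch (the source filename `kernel.cu` is a placeholder, and the flag syntax is the standard nvcc `-gencode` form from the 4.x toolkits):

```shell
# Build native Kepler (GK104 / GTX 680) code alongside a Fermi fallback.
# Without the sm_30 target, sm_20 binaries get JIT-recompiled by the
# driver at load time, which can mask or add overhead in benchmarks.
nvcc -O2 \
     -gencode arch=compute_30,code=sm_30 \
     -gencode arch=compute_20,code=sm_20 \
     -o kernel kernel.cu
```

Since the slowdown persists even with a native sm_30 target, JIT recompilation can probably be ruled out as the cause.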
PS: I couldn’t find a developer-version driver that supports the GTX 680.
It remains to be seen whether NVIDIA has pulled a bait-and-switch and we will all need to move to Teslas, or whether, at some point, the 680’s big brother will show up (on the gaming side) and be faster than the 580.
I really want to dig into some microbenchmarking to understand whether this is a compiler problem due to the switch to static instruction scheduling, or something else. Unfortunately, all these gamers have snarfed up the supply of GTX 680s, and mine is now back-ordered until the first week of May. :)
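The sort of microbenchmark I have in mind is a `clock()`-based dependent-latency test, sketched below. Everything here (kernel name, iteration count) is my own invention, just to illustrate the idea: on Fermi the hardware scheduler hides instruction latency dynamically, while on Kepler the compiler schedules statically, so a per-instruction latency regression here would point at ptxas rather than the silicon.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Measure the latency of a dependent chain of FMAs in SM clocks.
__global__ void fma_latency(float *out, clock_t *cycles, int iters)
{
    float x = out[0];               // unknown at compile time; defeats constant folding
    clock_t start = clock();
    for (int i = 0; i < iters; ++i)
        x = x * x + 1.0f;           // each FMA depends on the previous result
    clock_t stop = clock();
    out[0] = x;                     // keep the result live so the loop isn't eliminated
    cycles[0] = stop - start;
}

int main()
{
    float  *d_out;
    clock_t *d_cycles;
    cudaMalloc(&d_out, sizeof(float));
    cudaMalloc(&d_cycles, sizeof(clock_t));
    cudaMemset(d_out, 0, sizeof(float));

    const int iters = 1 << 16;
    fma_latency<<<1, 1>>>(d_out, d_cycles, iters);  // one thread: pure latency, no overlap

    clock_t cycles;
    cudaMemcpy(&cycles, d_cycles, sizeof(clock_t), cudaMemcpyDeviceToHost);
    printf("%.2f clocks per dependent FMA\n", (double)cycles / iters);

    cudaFree(d_out);
    cudaFree(d_cycles);
    return 0;
}
```

Running the same binary on a GTX 480 and a GTX 680 and comparing clocks-per-FMA would be one quick way to separate a scheduling/compiler regression from a memory-system or clocking difference.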
I too am very curious whether code tuned specifically for Kepler can perform better than what people have been reporting in the forums, and whether future revisions of CUDA 4.2 will bring better performance. But like you, I haven’t been able to get my hands on one yet.