Driver slows down software

My name is Tamas.
I have two problems.

  1. I was using a GTX 580 on 32-bit Windows XP Pro (CPU Intel Core 2 Family 6 Model 15 Stepping, clock 1864 MHz, 2 GB memory). With CUDA and SDK 4.0 and devdriver_4.0_winxp_32_270.81_general I wrote a small benchmark in Visual C++ Express 2008. The software generates a 65 Kword (float) data block and performs an FFT on it in a loop. With my original setup the software did 1330 FFT/sec.

When I changed to devdriver_4.2_winxp_32_301.32, the performance dropped to 260 FFT/sec (everything else was the same as above; only the driver changed).
Going back to devdriver_4.0_winxp_32_270.81_general restored the original 1330 FFT/sec speed.
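
For anyone who wants to reproduce the measurement, here is a minimal sketch of how an FFT/sec figure like the ones above can be timed. It is not my actual benchmark code: `measure_rate` and its `work` parameter are hypothetical names, and the FFT call itself is left to the caller. One assumption matters for GPU work: the callable has to block until the device has finished the transform, otherwise the loop only times how fast work can be queued.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not from the original benchmark): calls `work`
// `iterations` times and returns the number of calls per second.
// For a GPU FFT, `work` must wait for the device to finish each transform,
// otherwise only the launch overhead is measured.
double measure_rate(const std::function<void()>& work, std::size_t iterations) {
    using clock = std::chrono::steady_clock;

    work();  // one warm-up call, excluded from the timed region

    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;

    // Calls per second over the timed region.
    return static_cast<double>(iterations) / elapsed.count();
}
```

Calling `measure_rate` with a lambda that runs one FFT and then synchronizes with the device would give a number comparable to the FFT/sec values in this post, as long as the same helper is used under both drivers.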

  2. The original reason I got into this situation was that my company purchased a Gigabyte GV-N690D5-4GD video card and installed it on a Gigabyte Z77-D3H motherboard with an Ivy Bridge CPU and 16 GB RAM. We tried it with CUDA and SDK 4.2 and devdriver_4.2_winvista-win7_64_301.32_general on 64-bit Windows 7. The desktop driver (provided by Gigabyte) did see the GTX 690, but the devdriver did not.
    We applied a dirty trick: we plugged in the GTX 580. The devdriver recognized it, so we were able to install the driver, and after that it did see the GTX 690 as well. The benchmark performance, however, was poor on this system too.

This is why I moved back to the small 32-bit XP system, where devdriver_4.2_winxp_32_301.32 did not see the GTX 690 at first either, so I had to use the same trick.
The GTX 690 reached about 780 FFT/sec on the 32-bit Windows XP system, which is about 3 times the 260 FFT/sec of the GTX 580 with the same driver. That ratio seems OK given the 3 times more cores, but with the new development driver both the 580 and the 690 are way under the 1330 FFT/sec I got with the old one.

Thank you in advance for your help.