CUDA on Windows much slower than on Linux

Hi there, I have a CUDA application that I have to run on both Windows and Linux. However, the performance I get for the same test case on the same card (Tesla C1060) is very different: on Windows (Vista 64-bit) the run time is nearly double that on Linux (openSUSE 11.1 64-bit). Do you have any idea why? Many thanks,

Knowing nothing about the properties of your application and very little about your setup, my guess is that you are running with the default WDDM driver on Windows, and this driver model has quite a bit of overhead. The CUDA driver tries to minimize the impact of this by batching work, but the overhead can still be significant.

On Windows, you would want to use the TCC driver, which uses a different driver model and avoids most of the overhead inherent in the WDDM model.

Thanks njuffa, I have realized that I was using a different version of the CUDA compiler (5.0) on Windows than the one used on Linux (4.2). After installing 4.2 on Windows, I now get “only” a factor of 1.3 slower.

The drivers are different too, but I can’t use the same one because the installer doesn’t recognize my 2 GPUs (I use a Quadro for visualisation and a Tesla for computing).

How do I switch from the WDDM to the TCC driver?

Sorry, I have no first-hand knowledge of using the TCC driver. As best I can tell, switching to TCC mode is accomplished via the -dm option of nvidia-smi.
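
For what it is worth, here is a minimal sketch of how one might check which driver model each GPU is currently using, and build the nvidia-smi command that requests a change. It assumes nvidia-smi is on the PATH and that the installed driver supports the driver_model.current and driver_model.pending query fields (nvidia-smi --help-query-gpu lists them). The helper names parse_driver_models, query_driver_models and switch_command are made up for illustration. As far as I know, changing the driver model needs administrator rights and a reboot, and a GPU that is driving a display generally cannot be put into TCC mode, so treat this as a starting point rather than a tested recipe.

```python
import csv
import io
import subprocess

# Query fields as listed by `nvidia-smi --help-query-gpu`.
QUERY = "index,name,driver_model.current,driver_model.pending"


def parse_driver_models(csv_text):
    """Parse `--format=csv,noheader` output into one dict per GPU."""
    keys = ["index", "name", "current", "pending"]
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        # Skip blank or malformed lines instead of failing on them.
        if len(fields) == len(keys):
            rows.append(dict(zip(keys, (f.strip() for f in fields))))
    return rows


def query_driver_models():
    """Run nvidia-smi and return the parsed rows (needs an NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_driver_models(out)


def switch_command(gpu_index, model):
    """Build the command requesting a driver model (0 = WDDM, 1 = TCC)."""
    code = {"WDDM": "0", "TCC": "1"}[model.upper()]
    return ["nvidia-smi", "-i", str(gpu_index), "-dm", code]
```

Calling query_driver_models() on the Windows machine should show whether the Tesla is listed with WDDM or TCC, and switch_command(1, "TCC") returns the argument list to run from an elevated prompt, assuming the Tesla is GPU 1 on that machine.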

Strange, I have had the opposite experience with the GTX 680. Are you using Visual Studio? You do need to configure the compiler settings correctly and select x64 if you are using a 64-bit operating system.
Without any optimizations I can get 934 GFLOPS out of the 680 using CUBLAS matrix multiplication, while in Ubuntu I was only getting ~500 GFLOPS.

I would also recommend NVIDIA Nsight for use with Visual Studio. It is free and quite useful for debugging.

You probably have a lot of other differences too, such as compiler options. Maybe you are using a less powerful GPU on Windows, if the driver does not recognize the second one.