I'm using Windows 7 x64 and VS2010. When I time my CUDA program on Windows, it gives me weird results. The program is just a simple read and write by the threads. The time reported on Windows 7 is about 8302.192 ms; on Ubuntu it's 313.120 ms.

Can someone help me fix this timing discrepancy?

I have a Q6600 CPU, 4 GB of RAM, and two 8600 GT 512 MB cards.



I don’t know much about Windows or about the cards you use, but I have seen strange timings like this before. Most of the time they were due to the start-up cost of CUDA or of the card.
If you want to check this, and if the card and OS support it, try putting your card in persistence mode; on Linux, run “nvidia-smi -pm 1” as root.
If you can’t or don’t want to do that, you can still eliminate this initialisation time by calling a CUDA function on your device before you start the timing. The simplest one to use is cudaMalloc(&dummy, 0), where dummy is a pointer to anything.
So, you make your zero-sized cudaMalloc call, and only then start your timer.
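Here is a minimal sketch of that warm-up-then-time pattern. It is only an illustration: copyKernel is a made-up stand-in for your read/write kernel, the helper names warmUpDevice and timeCopyMs are my own, and I assume a toolkit recent enough to have cudaDeviceSynchronize. I use CUDA events for the timing here; a host timer started after the warm-up call would work the same way.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the "simple read and write" kernel in the question.
__global__ void copyKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Trigger CUDA initialisation before any timing starts,
// using the zero-sized cudaMalloc suggested above.
cudaError_t warmUpDevice()
{
    void *dummy = NULL;
    cudaError_t err = cudaMalloc(&dummy, 0);
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();
}

// Time one launch of copyKernel on n floats; returns milliseconds,
// or -1.0f if an allocation fails or a CUDA error is reported afterwards.
float timeCopyMs(int n)
{
    float *in = NULL, *out = NULL;
    size_t bytes = (size_t)n * sizeof(float);
    if (cudaMalloc(&in, bytes) != cudaSuccess) return -1.0f;
    if (cudaMalloc(&out, bytes) != cudaSuccess) { cudaFree(in); return -1.0f; }
    cudaMemset(in, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    cudaEventRecord(start, 0);              // the timer starts only here
    copyKernel<<<blocks, threads>>>(in, out, n);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);             // wait until the kernel has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    bool ok = (cudaGetLastError() == cudaSuccess);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(in);
    cudaFree(out);
    return ok ? ms : -1.0f;
}

int main()
{
    if (warmUpDevice() != cudaSuccess) {
        fprintf(stderr, "CUDA initialisation failed\n");
        return 1;
    }
    float ms = timeCopyMs(1 << 20);
    printf("copy kernel: %.3f ms\n", ms);
    return 0;
}
```

If the Windows and Ubuntu numbers come much closer together once the warm-up call is in place, the difference was initialisation time and not the kernel itself. cudaFree(0) is another call people commonly use for the same warm-up purpose.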
Does that make sense?