Hi, everyone.
I bought a GT240 and started “playing” with CUDA last week. I ran a few first tests, then wrote a simple Mandelbrot zoomer (I blogged about it and released the GPL code on my site).
I like the CUDA API; it’s nice and clean. But I just noticed that there’s a major speed difference between the two most popular OSes:
My Mandelbrot code runs at around 200 fps under Linux.
My Mandelbrot code runs at around 400 fps under Windows XP.
In both cases, I disabled VSync to get the maximum frame rate.
I am using CUDA 2.3 (i.e. the stable version) under both OSes.
Since I draw into an OpenGL pixel buffer object, all operations happen entirely in card memory… so I was puzzled by this. I thought that maybe the OpenGL implementation under Windows is so much better optimized that it runs circles around the Linux one, so I commented out the code that draws the generated data via the texture…
Speed when only doing calculations (no drawing) under Linux: 400 fps
Speed when only doing calculations (no drawing) under Windows: 680 fps
Again, even for pure computations, CUDA under Windows runs a lot faster.
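(For context, the interop part of my code follows the standard pixel-buffer-object pattern of the CUDA 2.3 runtime API; roughly the sketch below. The kernel name and launch parameters are just placeholders, not my actual code.)

#include <cuda_runtime.h>
#include <cuda_gl_interop.h>

// One-time setup: register the GL pixel buffer object with CUDA
// (pbo was created earlier with glGenBuffers/glBufferData)
cudaGLRegisterBufferObject(pbo);

// Every frame: map the PBO, let the kernel write pixels into it, unmap, draw
uchar4 *devPixels = NULL;
cudaGLMapBufferObject((void **)&devPixels, pbo);
mandelbrotKernel<<<grid, block>>>(devPixels, width, height, centerX, centerY, zoom);
cudaGLUnmapBufferObject(pbo);
// ...then glTexSubImage2D from the PBO and draw a textured quad...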
Perhaps my code is buggy somehow? To check, I tried the official nbody sample from the SDK:
Under Linux:
./nbody -benchmark -n=30000
Run “nbody -benchmark [-n=<numBodies>]” to measure perfomance.
30000 bodies, total time for 100 iterations: 19701.262 ms
= 4.568 billion interactions per second
= 91.365 GFLOP/s at 20 flops per interaction
Under Windows XP:
Run “nbody -benchmark [-n=<numBodies>]” to measure perfomance.
30000 bodies, total time for 100 iterations: 12137.919 ms
= 7.415 billion interactions per second
= 148.296 GFLOP/s at 20 flops per interaction
So it’s not just my code… For some reason, my GT240 runs at least 60% faster under Windows XP (and up to twice as fast in my own Mandelbrot test).
Any ideas why? Is this a driver bug? It seems weird: for pure calculations in the card’s global memory, I would expect nvcc to generate pretty much the same code under Windows and Linux.
Is anyone out there doing serious computations with CUDA under Linux who has seen this?
Thanks for any help,
Thanassis Tsiodras, Dr.-Ing.
P.S. Under Linux, the PowerMizer page of nvidia-settings shows 3 performance levels, with the 2nd one (i.e. not the highest) selected. The other two levels appear to be disabled, and selecting “Preferred mode: Maximum performance” doesn’t change this selection.
Could it be that under Linux the card is running at a lower clock frequency because of this?
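(If it helps with diagnosis: I assume the level and clocks actually in effect can be queried from a terminal with the nvidia-settings attributes below; the exact attribute names may vary between driver versions.)

nvidia-settings -q GPUCurrentPerfLevel
nvidia-settings -q GPUCurrentClockFreqs

The second query should report the current core and memory clocks, which would show whether the card stays at a lower performance level while the CUDA app is running.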