Hello,
I have a CUDA program that calls a kernel repeatedly, which I time with the cutil timer. I noticed that the time changes from run to run: it is either around 3.5 (+/- 0.01) ms or around 12.5 (+/- 0.08) ms. After two days of trying to find the problem, I checked the device properties with cudaGetDeviceProperties. When it runs slower, clockRate is 601710, and when it runs faster, it is 1512000 (the field is reported in kHz, so roughly 602 MHz versus 1512 MHz). There is no real pattern as to when it runs fast or slow, but generally if I let the machine sit for a minute and then run the program, it will be slow. The next few runs will be fast, and after that it is randomly either fast or slow.
I am running this on a Mac Pro under Leopard with an 8800 GT card. Is there a way to disable power management (which I assume is the problem), or to programmatically set the clock rate so it works as intended?
Thanks!
Hmm… is there any kind of workaround? I mean, this seems like it could SERIOUSLY screw up folks who are doing things that are timing-dependent (e.g., me). I have hundreds of simulations that take only a few minutes to run if the clock runs at the right speed, but if it doesn’t, my results are completely invalid, since the whole point of the simulation is to compare CUDA speeds and CPU speeds. Then I need to re-run the ones that did not run at the right speed, and those re-runs probably won’t run at the right speed either. It ends up taking 2 hours to run what should only take 3 minutes total.
Does anyone know if this problem exists on the Windows or Linux versions, or in older versions of the driver? Are there 2.2 beta versions available anywhere, or a date when we can expect them? This stuff is going in my dissertation, which is due in a month, and, well, it would be nice if it worked :-).
I just tested the program on my MacBook Pro, and it has the same problem: the reported clockRate is either 750000 or 933330 (kHz). Out of curiosity, do you actually check your clock rate every time you run a simulation to make sure it is running at the right speed?
I guess I’m trying Linux on Monday.
Thanks!
EDIT: I just tested a few other SDK apps, and output the clock rate for those tests to check if it was something I was doing. The clock rate varies for those as well, so it does not seem to be just my program.
No, I don’t actually check the clock rate. But my program does display performance data every 10s and it has always been consistent from run to run.
I just did a quick test, running several SDK apps and checking the clock. It was always at 0.75 GHz (the card is an 8600M).
Anyway, if Tim says there is a problem, then there is a problem. The fact that my machine doesn’t reproduce it doesn’t matter. Let’s hope for a quick release of CUDA 2.2 :)
Thanks for checking your system, I really appreciate it. I’m installing Linux right now, and I’ll report back in a little bit on whether it works or not.
Well, I have installed Linux and everything appears to be working fine now. It is actually a bit faster than in Leopard. I guess I will be finishing my tests in Linux! Thanks for all the help.