Yes… in this case I am extremely picky, since I want to dynamically change the rates, compare program efficiency at each clock rate, and graph the results.
But I am sure the BIOS method would work if you wanted a static base clock rate.
My recent theory: my GPUs at 100% utilization running complex CUDA code draw only about 2/3 of the wattage the same GPU draws when it’s 100% loaded with intensive graphics, likely because the fixed-function hardware sits idle during CUDA work. Seibert has found almost identical results with his own independent code. The clock rate a card ships at is based on the fastest speed the GRAPHICS use of the card can run stably.
But perhaps that means the CUDA-only portion of the chip has more stability margin than the more intense graphics portions. If so, severe overclocks of a GPU (especially of its shader clocks) might be 100% stable under CUDA even though any graphics app would immediately fail.
So I want to make a set of small CUDA kernels… do a small FFT, do an n-body step, compute some SHA-1 hashes, solve some linear algebra systems, sort some lists, etc., trying to get a variety of REPRODUCIBLE CUDA kernels to run. Then in a script I’ll run each and check its results for ANY deviation, even a single flipped bit. The expectation is that if the card is unstable, some kernels will show those errors as corrupted output. This is very much in the spirit of using Prime95 to test CPU stability.
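Here’s a minimal sketch of what I mean by one such reproducible kernel: every thread mixes its index through thousands of rounds of integer hashing, and the host recomputes the identical sequence for a bit-exact comparison. Integer math has no rounding modes to worry about, so ANY mismatch means the hardware produced a wrong result. (The hash constants are arbitrary; any good integer mixer would do.)

#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__host__ __device__ unsigned int mix(unsigned int x) {
    x ^= x >> 16;  x *= 0x7feb352du;
    x ^= x >> 15;  x *= 0x846ca68bu;
    x ^= x >> 16;
    return x;
}

__global__ void hashTorture(unsigned int *out, int rounds) {
    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int v = tid;
    for (int i = 0; i < rounds; ++i) v = mix(v);
    out[tid] = v;
}

int main() {
    const int N = 1 << 20, ROUNDS = 10000;
    unsigned int *d;
    cudaMalloc(&d, N * sizeof(unsigned int));
    hashTorture<<<N / 256, 256>>>(d, ROUNDS);
    std::vector<unsigned int> gpu(N);
    cudaMemcpy(gpu.data(), d, N * sizeof(unsigned int), cudaMemcpyDeviceToHost);

    int errors = 0;
    for (int t = 0; t < N; ++t) {             // bit-exact CPU reference
        unsigned int v = t;
        for (int i = 0; i < ROUNDS; ++i) v = mix(v);
        if (v != gpu[t]) ++errors;
    }
    if (errors) printf("UNSTABLE: %d corrupted results\n", errors);
    else        printf("clean pass\n");
    cudaFree(d);
    return errors != 0;
}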
Then finally, I want to put the CUDA tests into a script… run the set, check stability, boost the clock rate by 10 MHz and try again, and so on until failure; then drop back to the last stable rate and run it in a torture loop for hours to check robustness.
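The stepping loop itself could be as dumb as the host code below. This is a sketch, not a finished tool: it assumes Coolbits is enabled and that your driver exposes the GPU3DClockFreqs attribute (a “core,memory” pair; attribute names vary across driver generations, so substitute whatever clock control your driver offers), and runTestSuite() plus the test binaries it calls are hypothetical stand-ins for running every kernel above and passing only on bit-exact output.

#include <cstdio>
#include <cstdlib>

static bool setCoreClock(int coreMHz, int memMHz) {
    char cmd[256];
    snprintf(cmd, sizeof(cmd),
             "nvidia-settings -a GPU3DClockFreqs=%d,%d", coreMHz, memMHz);
    return std::system(cmd) == 0;
}

static bool runTestSuite() {
    // Hypothetical test binaries built from the kernels described above;
    // each exits nonzero if any output bit deviates from its reference.
    return std::system("./hash_torture && ./fft_check && ./nbody_check") == 0;
}

int main() {
    const int memMHz = 1000;      // hold memory clock fixed (assumed value)
    int core = 600;               // starting core clock (assumed value)
    int lastGood = -1;

    while (setCoreClock(core, memMHz) && runTestSuite()) {
        lastGood = core;
        core += 10;               // boost by 10 MHz and retest
    }

    if (lastGood < 0) { printf("unstable even at the base clock\n"); return 1; }

    printf("last stable core clock: %d MHz; starting torture loop...\n", lastGood);
    setCoreClock(lastGood, memMHz);
    for (;;) {                    // hours-long robustness soak
        if (!runTestSuite()) { printf("late failure at %d MHz\n", lastGood); return 1; }
    }
}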
Again, there’s nothing new about this kind of test… I think there are already tools for finding the “optimum overclock” for both CPUs and GPUs. But I want one that tests CUDA only (ignoring graphics stability)… hence I want scriptability.
What I’ll probably do is write the above tool, keep the kernels fast, and require the tested GPU to be driving a display so that the nvidia-settings clock rate adjustment works.
The other use of such a tool has nothing to do with overclocking. It is a script which runs your custom kernel at different shader and memory clock rates and prints a shmoo plot of your program’s behavior. This can give a very coarse but robust clue as to whether your program is memory- or compute-limited, and by how much. For this use, the display restriction really needs to be avoided.
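A sketch of that second use: sweep (core, memory) clock pairs, time one launch of your kernel at each point, and print the grid. If runtimes track the memory clock you’re bandwidth-bound; if they track the core/shader clock you’re compute-bound. The clock setting reuses the same hedged nvidia-settings call as above, the dummy kernel is a stand-in for your real one, and the clock ranges are made-up example values.

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void userKernel(float *buf, int n) {      // stand-in workload
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = buf[i] * 1.000001f + 0.5f;
}

static void setClocks(int coreMHz, int memMHz) {
    char cmd[256];                                   // same caveats as above
    snprintf(cmd, sizeof(cmd),
             "nvidia-settings -a GPU3DClockFreqs=%d,%d", coreMHz, memMHz);
    std::system(cmd);
}

int main() {
    const int n = 1 << 24;
    float *d;
    cudaMalloc(&d, n * sizeof(float));

    printf("core\\mem ");                            // header row of memory clocks
    for (int mem = 800; mem <= 1100; mem += 100) printf("%9d", mem);
    printf("\n");

    for (int core = 500; core <= 700; core += 50) {
        printf("%8d ", core);
        for (int mem = 800; mem <= 1100; mem += 100) {
            setClocks(core, mem);
            cudaEvent_t t0, t1;
            cudaEventCreate(&t0); cudaEventCreate(&t1);
            cudaEventRecord(t0);
            userKernel<<<(n + 255) / 256, 256>>>(d, n);
            cudaEventRecord(t1);
            cudaEventSynchronize(t1);
            float ms;
            cudaEventElapsedTime(&ms, t0, t1);       // kernel time at this point
            printf("%7.2fms", ms);
            cudaEventDestroy(t0); cudaEventDestroy(t1);
        }
        printf("\n");
    }
    cudaFree(d);
    return 0;
}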