Yes… in this case I am extremely picky, since I want to dynamically change the rates, compare program efficiency at each clock rate, and graph the results.
But I am sure the BIOS method would work if you wanted a static base clock rate.
My recent theory: my GPUs at 100% utilization running complex CUDA code draw only about 2/3 of the wattage the same GPU draws when it’s 100% loaded with intensive graphics, likely because the fixed-function hardware sits idle during CUDA work. Seibert has found almost identical results with his own independent code. The clock rate a card ships at is based on the fastest speed the GRAPHICS use of the card can run stably.
But perhaps that means the CUDA-only portion of the chip has more stability margin than the more intense graphics portions. If so, severe overclocks of a GPU (especially of its shader clocks) might be 100% stable under CUDA even though any graphics app would immediately fail.
So I want to make a set of small CUDA kernels… do a small FFT, do an n-body step, compute some SHA-1 hashes, solve some linear algebra systems, sort some lists, etc., trying to get a variety of REPRODUCIBLE CUDA kernels to run. Then in a script I’ll run each and check its results for ANY deviation, even a single flipped bit. The expectation is that if the card is unstable, some kernels will show those errors as corrupted output. This is very much in the spirit of using Prime95 to test CPU stability.
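Here’s a minimal sketch of what I mean by one such reproducible kernel: every thread mixes its index through thousands of rounds of integer hashing, and the host recomputes the identical sequence for a bit-exact comparison. Integer math has no rounding modes to worry about, so ANY mismatch means the hardware produced a wrong result. (The hash constants are arbitrary; any good integer mixer would do.)

#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__host__ __device__ unsigned int mix(unsigned int x) {
    x ^= x >> 16;  x *= 0x7feb352du;
    x ^= x >> 15;  x *= 0x846ca68bu;
    x ^= x >> 16;
    return x;
}

__global__ void hashTorture(unsigned int *out, int rounds) {
    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int v = tid;
    for (int i = 0; i < rounds; ++i) v = mix(v);
    out[tid] = v;
}

int main() {
    const int N = 1 << 20, ROUNDS = 10000;
    unsigned int *d;
    cudaMalloc(&d, N * sizeof(unsigned int));
    hashTorture<<<N / 256, 256>>>(d, ROUNDS);
    std::vector<unsigned int> gpu(N);
    cudaMemcpy(gpu.data(), d, N * sizeof(unsigned int), cudaMemcpyDeviceToHost);

    int errors = 0;
    for (int t = 0; t < N; ++t) {             // bit-exact CPU reference
        unsigned int v = t;
        for (int i = 0; i < ROUNDS; ++i) v = mix(v);
        if (v != gpu[t]) ++errors;
    }
    if (errors) printf("UNSTABLE: %d corrupted results\n", errors);
    else        printf("clean pass\n");
    cudaFree(d);
    return errors != 0;
}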
Then finally, I want to put the CUDA tests into a script… run the set, check stability, boost the clock rate by 10 MHz and try again, and so on until failure; then drop back to the last stable rate and run it in a torture loop for hours to check robustness.
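The stepping loop itself could be as dumb as the host code below. This is a sketch, not a finished tool: it assumes Coolbits is enabled and that your driver exposes the GPU3DClockFreqs attribute (a “core,memory” pair; attribute names vary across driver generations, so substitute whatever clock control your driver offers), and runTestSuite() plus the test binaries it calls are hypothetical stand-ins for running every kernel above and passing only on bit-exact output.

#include <cstdio>
#include <cstdlib>

static bool setCoreClock(int coreMHz, int memMHz) {
    char cmd[256];
    snprintf(cmd, sizeof(cmd),
             "nvidia-settings -a GPU3DClockFreqs=%d,%d", coreMHz, memMHz);
    return std::system(cmd) == 0;
}

static bool runTestSuite() {
    // Hypothetical test binaries built from the kernels described above;
    // each exits nonzero if any output bit deviates from its reference.
    return std::system("./hash_torture && ./fft_check && ./nbody_check") == 0;
}

int main() {
    const int memMHz = 1000;      // hold memory clock fixed (assumed value)
    int core = 600;               // starting core clock (assumed value)
    int lastGood = -1;

    while (setCoreClock(core, memMHz) && runTestSuite()) {
        lastGood = core;
        core += 10;               // boost by 10 MHz and retest
    }

    if (lastGood < 0) { printf("unstable even at the base clock\n"); return 1; }

    printf("last stable core clock: %d MHz; starting torture loop...\n", lastGood);
    setCoreClock(lastGood, memMHz);
    for (;;) {                    // hours-long robustness soak
        if (!runTestSuite()) { printf("late failure at %d MHz\n", lastGood); return 1; }
    }
}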
Again, there’s nothing new about this kind of test… I think there are already tools for finding the “optimum overclock” for both CPUs and GPUs. But I want one that tests CUDA only (ignoring graphics stability)… hence I want scriptability.
What I’ll probably do is write the above tool, keep the kernels fast, and require the tested GPU to be driving a display so that the nvidia-settings clock rate adjustment works.
The other use of such a tool has nothing to do with overclocking. It is a script which runs your custom kernel at different shader and memory clock rates and prints a shmoo plot of your program’s behavior. This can give a very coarse but robust clue as to whether your program is memory- or compute-limited, and by how much. For this use, the display restriction really needs to be avoided.
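A sketch of that second use: sweep (core, memory) clock pairs, time one launch of your kernel at each point, and print the grid. If runtimes track the memory clock you’re bandwidth-bound; if they track the core/shader clock you’re compute-bound. The clock setting reuses the same hedged nvidia-settings call as above, the dummy kernel is a stand-in for your real one, and the clock ranges are made-up example values.

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void userKernel(float *buf, int n) {      // stand-in workload
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = buf[i] * 1.000001f + 0.5f;
}

static void setClocks(int coreMHz, int memMHz) {
    char cmd[256];                                   // same caveats as above
    snprintf(cmd, sizeof(cmd),
             "nvidia-settings -a GPU3DClockFreqs=%d,%d", coreMHz, memMHz);
    std::system(cmd);
}

int main() {
    const int n = 1 << 24;
    float *d;
    cudaMalloc(&d, n * sizeof(float));

    printf("core\\mem ");                            // header row of memory clocks
    for (int mem = 800; mem <= 1100; mem += 100) printf("%9d", mem);
    printf("\n");

    for (int core = 500; core <= 700; core += 50) {
        printf("%8d ", core);
        for (int mem = 800; mem <= 1100; mem += 100) {
            setClocks(core, mem);
            cudaEvent_t t0, t1;
            cudaEventCreate(&t0); cudaEventCreate(&t1);
            cudaEventRecord(t0);
            userKernel<<<(n + 255) / 256, 256>>>(d, n);
            cudaEventRecord(t1);
            cudaEventSynchronize(t1);
            float ms;
            cudaEventElapsedTime(&ms, t0, t1);       // kernel time at this point
            printf("%7.2fms", ms);
            cudaEventDestroy(t0); cudaEventDestroy(t1);
        }
        printf("\n");
    }
    cudaFree(d);
    return 0;
}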