Some technical papers analyze their GPU bottlenecks by varying the GPU memory clock. If you graph your CUDA application's speed versus the GPU memory clock rate, the slope of the graph can give you an idea of whether you're bandwidth bound. (It's not perfect, but it IS interesting… certainly if you boost memory speed by 20% and there's no app speedup, your app is likely not memory bound.) Similarly, you can vary the shader clock rate and graph application speed against that.
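As a rough sketch of that slope analysis (a hypothetical helper, not from any published tool): model runtime as a compute part plus a memory part that scales inversely with the memory clock, and back out the memory-bound fraction from two runs.

```python
def memory_bound_fraction(base_clock, base_time, fast_clock, fast_time):
    """Rough estimate of how memory-bound an app is, from two runs at
    different memory clocks.  Models runtime as t = t_compute + c/clock:
    a fully memory-bound kernel speeds up in proportion to the clock,
    a fully compute-bound kernel doesn't speed up at all."""
    clock_ratio = fast_clock / base_clock   # e.g. 1.2 for a 20% boost
    speedup = base_time / fast_time         # observed speedup
    # Solve s = 1 / ((1-f) + f/r) for f, the memory-bound fraction:
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / clock_ratio)
```

It's the same "no speedup from a 20% boost means not memory bound" logic, just made quantitative; real kernels overlap compute and memory traffic, so treat the number as a hint, not a measurement.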
What’s the best way to change your GPU memory and shader clocks for testing? I’m not even talking about overclocking, though the techniques are obviously closely related.
The most popular tool seems to be RivaTuner: http://www.guru3d.com/index.php?page=rivatuner
But for development, we’d like to make a GRAPH, so with a GUI tool each clock change has to be made manually and the test loop re-run for every data point.
Is there some nice script or API call we could use to change the clocks from the app itself, so it can run the tests sequentially without a manual clock adjustment for every sample?
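I don’t know of a scriptable route on GeForce cards of that era, but on GPUs/drivers that support application clocks, `nvidia-smi -ac <mem,graphics>` can set them from the command line (and `-rac` restores defaults), so the whole sweep can be driven by a script. A hedged sketch, where the benchmark name `./my_bench` and the clock values are made-up placeholders — valid clock pairs come from `nvidia-smi -q -d SUPPORTED_CLOCKS`, and setting clocks typically needs root:

```python
import subprocess

def sweep_commands(mem_clocks_mhz, gr_clock_mhz, bench_cmd="./my_bench"):
    """Build (set-clocks, run-benchmark) command pairs for a memory-clock
    sweep.  nvidia-smi -ac <mem,graphics> sets application clocks; it only
    works on GPUs/drivers that expose that feature."""
    return [(["nvidia-smi", "-ac", f"{m},{gr_clock_mhz}"], [bench_cmd])
            for m in mem_clocks_mhz]

def run_sweep(mem_clocks_mhz, gr_clock_mhz):
    # Hypothetical driver loop: set each clock, run the benchmark, then
    # restore the default application clocks.
    for set_cmd, bench in sweep_commands(mem_clocks_mhz, gr_clock_mhz):
        subprocess.run(set_cmd, check=True)
        subprocess.run(bench, check=True)
    subprocess.run(["nvidia-smi", "-rac"], check=True)

if __name__ == "__main__":
    run_sweep([800, 900, 1000, 1100], 1350)  # placeholder MHz values
```

The same knob is exposed programmatically through NVML (`nvmlDeviceSetApplicationsClocks`), which would let the app change its own clocks between test passes instead of shelling out.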
And, I am pretty sure the answer is NO, but has anyone heard of any tool that lets you disable SMs? I.e., tell your GTX280 to use only 27 of its 30 SMs, making it emulate a GTX260-216, etc. That would be interesting for bottleneck analysis too.