I’m not sure whether this has been discussed before; the forum search didn’t turn up anything. I’ve found a fairly simple, almost idiotic way to tell whether a kernel is compute-bound or memory-bound.
This works on Windows; I’m not sure whether similar tools are available on Linux.
By installing NVIDIA System Tools, one has access to GPU (cough) underclocking.
Measure the kernel execution time or performance under the following scenarios:
- Default memory and shader clocks
- Default memory and lowered shader clocks
- Lowered memory and default shader clocks
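If performance drops with the lowered shader clock but not with the lowered memory clock, the kernel is compute-bound; if it drops with the lowered memory clock but not with the lowered shader clock, it is memory-bound.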
For example, on my 9800GT I lowered the memory clock from 950MHz to 273MHz and the kernel performance was unchanged, whereas any change to the shader clock caused a proportional change in kernel performance. In other words, that kernel is compute-bound.
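To put a number on "proportional", here is a rough Python sketch. The function name `clock_sensitivity` and the shader-clock figures in the usage lines are made up for illustration; only the 950MHz to 273MHz memory clocks come from the example above. It assumes you have already measured the kernel time at both clock settings.

```python
def clock_sensitivity(t_default, t_lowered, clk_default, clk_lowered):
    """Return how strongly kernel time follows a clock reduction.

    1.0 means the time grew by the same factor the clock was lowered
    (the kernel is fully limited by that clock); 0.0 means the time
    did not change at all. Times can be in any unit, as long as both
    use the same one; the same goes for the clocks.
    """
    if t_default <= 0 or t_lowered <= 0:
        raise ValueError("times must be positive")
    if not (0 < clk_lowered < clk_default):
        raise ValueError("lowered clock must be positive and below the default")
    slowdown = t_lowered / t_default - 1.0        # relative increase in time
    clock_cut = clk_default / clk_lowered - 1.0   # slowdown if fully clock-bound
    return slowdown / clock_cut


# Memory clock 950 -> 273 MHz with an unchanged time (hypothetical 10 ms).
print(clock_sensitivity(10.0, 10.0, 950, 273))    # 0.0
# Hypothetical shader clock halved and time doubled.
print(clock_sensitivity(10.0, 20.0, 1500, 750))   # 1.0
```

Values in between suggest the kernel is only partly limited by that clock.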
Of course, it is possible that lowering either clock will reduce kernel performance, in which case the kernel is “balanced”.
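The decision rule can be sketched in a few lines of Python. This is only an illustration: the function name `classify_kernel`, the 5% `tolerance` for measurement noise, and the extra "inconclusive" outcome (neither clock made a difference) are my own assumptions, and the timings in the usage lines are invented.

```python
def classify_kernel(t_default, t_low_shader, t_low_memory, tolerance=0.05):
    """Classify a kernel from three timings of the same run.

    t_default    -- time with default memory and shader clocks
    t_low_shader -- time with default memory and lowered shader clock
    t_low_memory -- time with lowered memory and default shader clock
    tolerance    -- relative slowdown treated as noise (assumed value)
    """
    if min(t_default, t_low_shader, t_low_memory) <= 0:
        raise ValueError("times must be positive")
    # A clock "matters" if lowering it slows the kernel beyond the noise level.
    shader_matters = (t_low_shader - t_default) / t_default > tolerance
    memory_matters = (t_low_memory - t_default) / t_default > tolerance
    if shader_matters and memory_matters:
        return "balanced"
    if shader_matters:
        return "compute-bound"
    if memory_matters:
        return "memory-bound"
    return "inconclusive"


print(classify_kernel(10.0, 20.0, 10.1))   # compute-bound
print(classify_kernel(10.0, 10.2, 25.0))   # memory-bound
print(classify_kernel(10.0, 15.0, 14.0))   # balanced
```

Averaging several runs per setting before feeding the times in should make the result less sensitive to the tolerance you pick.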