SM clock 1.35G vs. GPU core 575MHz?

I just saw a post saying that the SM clock rate is higher than the core clock.
How does this affect the number of cycles per warp instruction? And when estimating performance, why should we use 1.35 GHz instead of the core clock rate?


This is because CUDA runs on the shader processors, so it uses the shader clock. Each card has a different shader clock rate. For the 8800GTX and the Tesla it is 1.35 GHz, but overclocked versions of the GTX and other models have different rates.

The clock in “clock cycles per warp” in the Programming Guide is the shader clock.

A warp of 32 threads takes 4 clock cycles to propagate through the shaders, so the core clock (which controls the instruction decoder) does not need to be as fast. You can’t control the core clock and the shader clock separately, so their exact ratio is set by some internal hardware requirements that people outside NVIDIA aren’t privy to. For performance estimates, I don’t think you need to pay attention to the core clock at all.

Actually, it's possible to set different shader/core/ROP speeds with RivaTuner on Windows, or NVClock on Linux.

But seibert is right about the core clock speed: cutting my 8800GT's core from 650 to 300 MHz made very little difference when running the n-body simulation from the NVIDIA SDK.

Changing the shader speed, on the other hand, had a huge impact on the n-body executable.