For instance, for the RTX 3080 Ti, the specification reports a core boost clock of 1670 MHz. However, using nvidia-smi --lock-gpu-clocks=2100, I was able to achieve 1965 MHz.
This touches on the general domain of overclocking, which I won’t try to cover exhaustively here. A few points seem important to me:
Workload conditions matter. You say “I was able to achieve 1965 MHz” but don’t say under what workload conditions. My guess is idle, but even if not, I doubt you can achieve 1965 MHz while running an AI/DL training workload or, equivalently, back-to-back large SGEMM operations (or tensor-core GEMM ops). Back-to-back large GEMMs will probably drive your achieved clock down much closer to the base clock. That has been my experience, anyway. (A sketch that generates such a load follows these points.)
The exact GPU matters - not all 3080 Ti GPUs are “the same”. The die itself may have (power-)behavioral variation from unit to unit, and a specific end-user variant (i.e. product SKU) may differ from another SKU, e.g. in its cooling solution. Even unit-to-unit variation in something like the effectiveness of the heatsink TIM could cause two otherwise “identical” GPUs to behave somewhat differently.
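To make the workload point concrete, here is a minimal load generator (my own sketch, not anything from the original poster; it assumes a CUDA toolkit with cuBLAS, built with something like nvcc sgemm_load.cu -lcublas -o sgemm_load). It issues back-to-back large SGEMMs; running it while watching nvidia-smi in a second terminal will typically show the achieved SM clock settling well below the short-burst maximum:

```
// Keeps the GPU busy with back-to-back large SGEMMs so sustained clock
// behavior can be observed with a separate monitoring tool.
// Error checking omitted for brevity.
#include <cublas_v2.h>
#include <cuda_runtime.h>

int main() {
    const int n = 8192;                        // large square matrices
    const float alpha = 1.0f, beta = 0.0f;
    float *a, *b, *c;
    cudaMalloc(&a, sizeof(float) * n * n);
    cudaMalloc(&b, sizeof(float) * n * n);
    cudaMalloc(&c, sizeof(float) * n * n);
    cudaMemset(a, 0, sizeof(float) * n * n);   // contents don't matter for load
    cudaMemset(b, 0, sizeof(float) * n * n);

    cublasHandle_t handle;
    cublasCreate(&handle);

    for (int i = 0; i < 1000; ++i) {           // back-to-back GEMMs
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                    &alpha, a, n, b, n, &beta, c, n);
    }
    cudaDeviceSynchronize();

    cublasDestroy(handle);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```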
The general idea behind the boost clock is that it indicates what frequency the GPU can sustain under certain real-world conditions, across a range of instances of a GPU type (manufacturer SKU, GPU part, etc.). What are those conditions? I don’t know, but they are intended to be useful for general comparison - a specific goal of your first link - not as a guarantee or a crisp, testable specification in every case.
The difference, then, between what NVIDIA marketing publishes as a boost frequency for comparison and what you observe in a specific case is probably related to the specifics of your GPU as well as the specifics of the workload. Since I don’t have a workload spec to offer you, you may want to treat it as a general comparison number, not a guarantee. Since GeForce GPUs serve the important gaming market (among others), my guess is that the boost frequency represents an estimate of the frequencies typically observed in practice across a set of gaming (and possibly other) workloads that are representative of how GeForce GPUs are commonly used.
A thread like this is prone to attracting reports of “exceptions” or observations that don’t seem to conform to the impressions I have shared. I won’t be able to respond to individual cases or reports. The guidance I have offered may or may not be useful. It’s not a guarantee. YMMV.
Based on observations with various actively cooled GPUs, the nominal boost clock appears to represent a kind of “target frequency” used by the thermal and power management of the GPU.
This management can control (1) clock frequency, (2) voltage, and (3) fan speed. There is a physical dependency between clock frequency and voltage: the former generally cannot increase without the latter also increasing, because transistors switch faster at higher voltage. There are also physical limits on the voltage in current CMOS technology: around 0.7 V minimum for reliable switching of transistors, and around 1.05 V maximum to prevent damage to, and premature aging of, the transistors.
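As a rough first-order model (my addition here, not anything NVIDIA publishes for these GPUs): dynamic switching power in CMOS scales as

P_dyn ≈ α · C · V² · f

where α is the activity factor, C the switched capacitance, V the supply voltage, and f the clock frequency. Since raising f usually requires raising V as well, power grows considerably faster than linearly with clock frequency, which is why the management mechanism has to treat frequency, voltage, and cooling jointly.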
The sensor data feeding the management mechanism’s decisions comprises (1) temperatures of various GPU components, (2) power draw, and (3) voltage stability.
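For anyone who wants to watch these quantities while a sustained load (such as the SGEMM sketch above) runs, here is a minimal NVML polling loop. Again, this is my own illustration, not something from NVIDIA’s documentation of the management mechanism; note that NVML does not expose core voltage on most consumer GPUs, so only clocks, temperature, power, and fan speed are read (build with something like nvcc nvml_watch.cu -lnvidia-ml -o nvml_watch, Linux assumed for sleep()):

```
// Samples the quantities the management mechanism appears to act on.
// Error checking omitted for brevity; some queries (e.g. fan speed)
// return "not supported" on some boards.
#include <nvml.h>
#include <cstdio>
#include <unistd.h>

int main() {
    nvmlInit();
    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);

    for (int i = 0; i < 60; ++i) {             // one sample per second
        unsigned int smMHz = 0, tempC = 0, mW = 0, fanPct = 0;
        nvmlDeviceGetClockInfo(dev, NVML_CLOCK_SM, &smMHz);
        nvmlDeviceGetTemperature(dev, NVML_TEMPERATURE_GPU, &tempC);
        nvmlDeviceGetPowerUsage(dev, &mW);
        nvmlDeviceGetFanSpeed(dev, &fanPct);
        printf("sm=%u MHz  temp=%u C  power=%.1f W  fan=%u%%\n",
               smMHz, tempC, mW / 1000.0, fanPct);
        sleep(1);
    }

    nvmlShutdown();
    return 0;
}
```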
For example, on my RTX 4000 the nominal boost clock is 1545 MHz, but the maximum achievable boost clock for short bursts of work is something like 1920 MHz. Under a heavy sustained workload, the GPU heats up and the fan speed increases. However, the fan only spins fast enough to ensure that the nominal boost frequency can be reached without hitting the thermal throttling limit (something like 83 degrees Celsius for this model). Even if a further increase in fan speed would allow a higher boost clock without overheating, that does not happen. Conversely, if the thermal limit is reached at clock speeds below the nominal boost frequency, fan speed is regulated up further and can reach 100% of maximum. In my observations, that second scenario can occur in one of two cases: (1) heavy dust accumulation on the fins of the GPU heatsink, or (2) very high ambient temperature.
Keep in mind that the behavioral description above (1) is based purely on observation, (2) rests on a small number of samples (a grand total of three different GPUs), and (3) could change at any moment with a driver update.