GPU cache hit rate fluctuation problem

ncu mainly relies on the PMU (performance monitoring unit) to collect statistics. The PMU has a limited number of counters, so multiple replays are required to measure all the metrics. To keep the replay process from affecting the results, you can serialize the operators (kernels), lock the clock frequency, and clear the cache between runs. Even so, the results we measure with ncu still show various abnormal fluctuations in cache hit rate that we cannot explain. I previously asked someone from NVIDIA on the forum. He said that as long as the graphics card is not being used by anyone else, the cache hit rate should in theory not fluctuate, but we cannot reproduce that behavior.
How can this fluctuation in cache hit rate be explained?
Also, what does "steady state" in the official documentation actually mean?
https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html#range-and-precision
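
For reference, here is a minimal Python sketch of the kind of setup described above, so the fluctuation can be quantified rather than eyeballed. It only builds an ncu command line with cache flushing and clock locking enabled, and summarizes hit rates collected over repeated runs. The helper names (`build_ncu_command`, `summarize_hit_rates`), the application path `./my_app`, and the sample values are hypothetical, and the metric name `lts__t_sector_hit_rate.pct` is just one L2 hit rate metric that could be used here, not necessarily the one we measured.

```python
import statistics


def build_ncu_command(app_args, metrics):
    """Return an argv list for one ncu run (the command is not executed here).

    Assumption: kernel replay with the cache flushed before each replay pass
    and clocks locked to base, which matches the setup described in the post.
    """
    return [
        "ncu",
        "--replay-mode", "kernel",   # replay each kernel to collect all metrics
        "--cache-control", "all",    # flush caches before each replay pass
        "--clock-control", "base",   # lock GPU clocks to base frequency
        "--metrics", ",".join(metrics),
        "--csv",
        *app_args,
    ]


def summarize_hit_rates(hit_rates):
    """Summarize hit rates (in percent) measured over repeated runs."""
    mean = statistics.mean(hit_rates)
    # Sample standard deviation needs at least two measurements.
    stdev = statistics.stdev(hit_rates) if len(hit_rates) > 1 else 0.0
    return {
        "mean": mean,
        "stdev": stdev,
        "min": min(hit_rates),
        "max": max(hit_rates),
        "spread": max(hit_rates) - min(hit_rates),
    }


if __name__ == "__main__":
    # Hypothetical application path and made-up hit rates, for illustration only.
    print(" ".join(build_ncu_command(["./my_app"], ["lts__t_sector_hit_rate.pct"])))
    print(summarize_hit_rates([80.0, 82.0, 78.0]))
```

Comparing the spread across runs for the same kernel and the same metric shows whether the variation is larger than what measurement noise alone would suggest.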
