In particular, the hit ratio is introduced as a way to avoid cache thrashing (i.e. useful cache lines being evicted when the access window exceeds the persistence cache size and the hit ratio is 1). In the micro-benchmark from the first link, a hit ratio < 1 gives each memory access a random chance of being treated as either streaming or persisting. If the hit ratio is 0.5, then half of all memory accesses to the access window are treated as streaming, and that potentially useful data is the first to be evicted.
Given that the micro-benchmark code could be considered to make random memory accesses to the persistent cache region (i.e. no pattern), why does a lower hit ratio generate any performance boost if cache lines are still being evicted?
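For reference, a minimal sketch of the kind of setup being discussed (this is not the benchmark from the link; the window size, persistence cache size, pointer and stream names are illustrative):

```cpp
#include <cuda_runtime.h>

int main()
{
    const size_t window_bytes  = 2 * 1024 * 1024;  // 2MB access window
    const size_t persist_bytes = 1 * 1024 * 1024;  // 1MB of L2 set aside for persisting accesses

    void* data = nullptr;
    cudaMalloc(&data, window_bytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Carve out part of L2 for persisting accesses (requires compute capability 8.0+).
    cudaDeviceSetLimit(cudaLimitPersistingL2CacheSize, persist_bytes);

    // Attach an access policy window to the stream: accesses to [data, data + window_bytes)
    // get the persisting property for the hitRatio fraction of the window, streaming otherwise.
    cudaStreamAttrValue attr = {};
    attr.accessPolicyWindow.base_ptr  = data;
    attr.accessPolicyWindow.num_bytes = window_bytes;
    attr.accessPolicyWindow.hitRatio  = 0.5f;
    attr.accessPolicyWindow.hitProp   = cudaAccessPropertyPersisting;
    attr.accessPolicyWindow.missProp  = cudaAccessPropertyStreaming;
    cudaStreamSetAttribute(stream, cudaStreamAttributeAccessPolicyWindow, &attr);

    // ... launch kernels on `stream` that access `data` ...

    cudaStreamDestroy(stream);
    cudaFree(data);
    return 0;
}
```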
The use of the word “random” there is approximately correct if we posit that the data access pattern is random. However, there is not really a random chance that a given access will be “persistence”, i.e. cached. There is a specific pattern based on the requested address; it is not random.
Think of it this way. For an access window of 2MB, a persistence cache size of 1MB, and a 0.5 hit ratio, the data will be cached in a pattern something like the following (each X or Y represents eight bytes):
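```
XYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXY ... (repeating across the 2MB window)
```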
So your code is accessing this region. If the access falls on an area indicated by X, it will be cached in the persistence cache (“persistence”). If it falls on an area indicated by Y, it will not be cached in the persistence cache (“streaming”).
This is an example that I advance for understanding. Please do not assume that the pattern indicated above is the exact pattern you should expect; it may be different. But the general idea is that some portion of the footprint will be cached, some portion won’t, and the ratio of cached to uncached regions can be inferred from the supplied hit ratio.
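To make that concrete, here is a toy host-side sketch that classifies an offset within the window under the illustrative alternating pattern above. The 8-byte granularity and the even/odd rule are assumptions for this example only, not the actual hardware mapping:

```cpp
#include <cstddef>
#include <cstdio>

// Toy model only: assumes the illustrative alternating 8-byte X/Y layout above
// for a 0.5 hit ratio. The real hardware mapping is unspecified and may differ.
static bool is_persisting(std::size_t byte_offset_in_window)
{
    std::size_t chunk = byte_offset_in_window / 8; // which 8-byte chunk the access lands in
    return (chunk % 2) == 0;                       // even chunks = X (persistence), odd = Y (streaming)
}

int main()
{
    // Under this model, roughly half the 2MB window carries the persisting property.
    const std::size_t window_bytes = 2 * 1024 * 1024;
    std::size_t persisting = 0;
    for (std::size_t off = 0; off < window_bytes; off += 8)
        if (is_persisting(off))
            ++persisting;
    std::printf("persisting fraction: %.2f\n",
                static_cast<double>(persisting) * 8 / window_bytes);
    return 0;
}
```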