L2 persistence clarifications

Robert_Crovella · February 12, 2024, 5:10pm

L2 persistence is documented in at least two places: the programming guide and the best practices guide.

I haven’t studied your post in great detail, but after going through it more than once, I distill it down to a couple of things:

It is probably best to take the documentation at face value.
You are wondering what happens to kernels launched into streams that have no persistence specification.

I’m quite certain that if a cache line is pulled into the L2 (and not evicted), a subsequent request for data in that cache line will hit in the L2.

Beyond that, the behavior of overlapping regions, or simply of multiple streams sharing the same L2 persistence carveout, is not fully specified, and I’m not sure it ever will be. The cited documentation makes reference to this case; it does not give a precise answer but mentions “sharing” the carveout. Given that the mechanism seems to have a probabilistic aspect, I’m not sure that a detailed access-by-access specification will ever be given. (For example, for a hit ratio less than 1.0, no formula is given that I can see to determine precisely, a priori, whether a given address will be cached (i.e. will persist) or not.)

If you launch a kernel into a stream that doesn’t have a persistence “spec”, then there is no reason to assume that data requested by that kernel will make it into the persistence region (i.e. the carveout). It may, but I don’t see a detailed description of this. If it matters to you, you should probably provide a persistence spec for that stream.
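
As a rough illustration of providing a persistence spec for a stream, here is a minimal host-side sketch using the runtime API's access policy window. The helper names (`fitting_hit_ratio`, `make_persisting_window`, `set_stream_persistence`) are hypothetical and not part of CUDA. Sizing the carveout from the window size, and picking a hit ratio so that the expected persisting footprint fits the carveout, are assumptions made for the example, not documented requirements.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cstddef>

// Hypothetical helper: pick a hit ratio so that the expected persisting
// footprint (hit_ratio * window_bytes) does not exceed the carveout.
inline float fitting_hit_ratio(size_t window_bytes, size_t carveout_bytes) {
    if (window_bytes == 0) return 0.0f;
    double r = static_cast<double>(carveout_bytes) / static_cast<double>(window_bytes);
    return static_cast<float>(std::min(1.0, r));
}

// Hypothetical helper: describe a window whose "hit" accesses are treated as
// persisting and whose remaining accesses are treated as streaming.
inline cudaAccessPolicyWindow make_persisting_window(void* base, size_t bytes,
                                                     float hit_ratio) {
    cudaAccessPolicyWindow w = {};
    w.base_ptr  = base;
    w.num_bytes = bytes;
    w.hitRatio  = hit_ratio;
    w.hitProp   = cudaAccessPropertyPersisting;
    w.missProp  = cudaAccessPropertyStreaming;
    return w;
}

// Hypothetical helper: give `stream` a persistence spec covering [base, base + bytes).
// Note: the carveout set through cudaDeviceSetLimit is device-wide, so this
// call also affects every other stream that uses persistence on the device.
inline cudaError_t set_stream_persistence(cudaStream_t stream, void* base, size_t bytes) {
    int dev = 0;
    cudaError_t err = cudaGetDevice(&dev);
    if (err != cudaSuccess) return err;

    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, dev);
    if (err != cudaSuccess) return err;
    if (prop.persistingL2CacheMaxSize <= 0) return cudaErrorNotSupported;

    // Assumption: size the carveout from the requested window, capped at the device maximum.
    size_t carveout = std::min(bytes, static_cast<size_t>(prop.persistingL2CacheMaxSize));
    err = cudaDeviceSetLimit(cudaLimitPersistingL2CacheSize, carveout);
    if (err != cudaSuccess) return err;

    // The window itself is also capped by a device property.
    size_t window = std::min(bytes, static_cast<size_t>(prop.accessPolicyMaxWindowSize));

    cudaStreamAttrValue attr = {};
    attr.accessPolicyWindow =
        make_persisting_window(base, window, fitting_hit_ratio(window, carveout));
    return cudaStreamSetAttribute(stream, cudaStreamAttributeAccessPolicyWindow, &attr);
}
```

Kernels launched into `stream` after a successful `set_stream_persistence(stream, d_ptr, n_bytes)` call pick up the window; kernels in other streams get no spec from this call. Which individual accesses inside the window end up persisting when the hit ratio is below 1.0 remains up to the hardware, as discussed above.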

You’re welcome to file a bug to request any documentation clarifications you would like to see.