I’ve recently been using Tegra Profiler 1.2 to watch the cache behavior of my computations as the dataset size increases. From the looks of it, I’m causing many L1 spills into L2 (possibly exacerbated by the pseudorandom cache-line replacement policy in L1). I’m seeing almost exactly as many L1 write misses as L1 read misses, which is strange: most of my dataset is modified in place, so those cache lines should ideally already be in L1 by the time the writes happen. L2 reads and writes behave as I’d expect (that is, the L2 write miss count is significantly lower than the L2 read miss count).
Anyway, this brings me to my question. Is there any way that you guys could expose functionality to track the actual L1/L2 hit rate, not just the miss count? Naturally, when I’m crunching more data, I miss more often, but I have no way of knowing whether I’m also hitting more often (which would keep my hit rate constant).
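To make it concrete why the raw miss count is ambiguous, here's a quick sketch (hypothetical counter values, not from the profiler): doubling the miss count can mean the cache got worse, or just that the workload doubled while the hit rate stayed the same.

```python
def hit_rate(hits, misses):
    """Hit rate from raw hit/miss counters."""
    return hits / (hits + misses)

# Small dataset: 1,000 misses out of 10,000 accesses.
small = hit_rate(hits=9000, misses=1000)    # 0.90

# Larger dataset: twice the misses (2,000), which looks worse
# in a miss-count-only view -- but twice the hits too.
large = hit_rate(hits=18000, misses=2000)   # still 0.90

print(small, large)
```

With only the miss counters exposed, these two runs look like a 2x regression; with the hit counters (or the rate) exposed, it's clear the cache is performing identically.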