Cache hit rate analysis

I’ve recently been using Tegra Profiler 1.2 to watch the cache behavior of my computations as the size of the dataset increases. From the looks of it, I’m causing many L1 spills into L2 (possibly exacerbated by L1’s pseudorandom cache line replacement policy). I’m seeing almost exactly as many L1 write misses as L1 read misses, which is strange because most of my dataset is modified in-place, so those cache lines should already be in L1. L2 reads and writes behave as I’d expect (that is, the L2 write miss count is significantly lower than the L2 read miss count).

Anyway, this brings me to my question. Is there any way that you guys could expose functionality to track the actual L1/L2 hit rate (not the miss count)? Naturally, when I’m crunching more data, I miss more often, but I have no way of knowing whether I’m also hitting more (thus keeping my hit rate constant).
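
(For context: a hit rate can be derived whenever a profiler exposes the total access count alongside the miss count, since hit rate = 1 − misses / accesses. A minimal sketch in C, assuming the two inputs come from the ARMv7 PMU’s “L1 data cache access” (0x04) and “L1 data cache refill” (0x03) events; the function name is illustrative, and Tegra Profiler 1.2 does not expose this metric directly, which is the point of the question above:)

```c
#include <stdint.h>

/* Derive an L1 hit rate from two PMU counters:
 *   accesses - total L1 data cache accesses (ARMv7 common event 0x04)
 *   refills  - L1 data cache refills, i.e. misses (event 0x03)
 * Illustrative sketch, not a Tegra Profiler API. */
double l1_hit_rate(uint64_t accesses, uint64_t refills)
{
    if (accesses == 0)
        return 0.0;
    return 1.0 - (double)refills / (double)accesses;
}
```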

Hi alcom,
Sorry, the current release doesn’t have this feature, but we are considering your request and tracking it internally.

When you collect the numbers on any A9-based device, there is no way to distinguish between L1 read misses and L1 write misses. Cortex-A9 provides only a single ‘L1 cache refill’ event, which counts both read and write misses. Cortex-A15, by contrast, provides two separate events for L1 read misses and L1 write misses.
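
(For anyone who wants the raw counts outside Tegra Profiler: on a Linux/Android kernel with perf support, these events can be read directly via perf_event_open(2) with raw event numbers. The sketch below assumes the ARMv7 common event 0x03 for the combined L1 data cache refill on Cortex-A9, and events 0x42/0x43 for the separate read/write refills on Cortex-A15; verify these numbers against the relevant TRM before relying on them:)

```c
/* Minimal sketch: read one ARM PMU event via perf_event_open(2).
 * Event numbers are assumptions to check against the CPU's TRM:
 *   0x03 - L1 data cache refill (ARMv7 common event; the only L1 miss
 *          event on Cortex-A9, covering both reads and writes)
 *   0x42 - L1 data cache refill, read  (Cortex-A15)
 *   0x43 - L1 data cache refill, write (Cortex-A15) */
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <string.h>
#include <stdio.h>

static int open_raw(uint64_t config)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_RAW;     /* raw, CPU-specific event number */
    attr.config = config;
    attr.disabled = 1;             /* start stopped; enable explicitly */
    attr.exclude_kernel = 1;
    /* this process, any CPU, no group leader, no flags */
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}

int main(void)
{
    int fd = open_raw(0x03);       /* Cortex-A9: combined L1D refill */
    if (fd < 0) {
        perror("perf_event_open");
        return 1;
    }

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

    /* ... run the workload being measured here ... */

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    uint64_t refills = 0;
    if (read(fd, &refills, sizeof(refills)) == sizeof(refills))
        printf("L1D refills: %llu\n", (unsigned long long)refills);

    close(fd);
    return 0;
}
```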

To deal with that, Tegra Profiler simply splits the count in half: L1 read misses = L1 refills / 2 and L1 write misses = L1 refills / 2, which is exactly what you’re observing.
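
(A minimal sketch of that split, with hypothetical names and a made-up sample count, not Tegra Profiler’s internal code; it shows why the two reported numbers always come out equal:)

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t l1_refills = 10000;            /* combined L1D refill count */
    /* The single Cortex-A9 refill count is halved into "read" and
     * "write" misses, so the two reported values are always equal. */
    uint64_t l1_read_miss  = l1_refills / 2;
    uint64_t l1_write_miss = l1_refills / 2;
    printf("L1 read misses: %llu, L1 write misses: %llu\n",
           (unsigned long long)l1_read_miss,
           (unsigned long long)l1_write_miss);
    return 0;
}
```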

You could argue that this is incorrect behavior, but it will report accurate numbers on future devices where the separate read and write miss events are available.