I have a question about the DRAM sectors metric, specifically dram__sectors.sum. My understanding is that this value should represent the number of L2 cache misses.
However, when I try to calculate it by multiplying the total number of L2 sectors (lts__t_sectors.sum) by the L2 miss rate (1 - lts__t_sector_hit_rate.pct), the result doesn’t match the value reported by dram__sectors.sum.
Here’s an example from a GEMM application:
lts__t_sector_hit_rate.pct (%) 99.37
lts__t_sectors.sum (sector) 27,430,992
dram__sectors.sum (sector) 98,312
Shouldn’t dram__sectors.sum be equal to lts__t_sectors.sum * (1 - lts__t_sector_hit_rate.pct/100)?
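Plugging the numbers above into that formula (a quick sanity check, using the values from the GEMM run) gives an estimate well above what the profiler reports:

```python
# Naive estimate of DRAM sectors from the L2 metrics in the GEMM run above.
lts_t_sectors = 27_430_992       # lts__t_sectors.sum
l2_hit_rate = 99.37 / 100        # lts__t_sector_hit_rate.pct
dram_sectors_reported = 98_312   # dram__sectors.sum

naive_miss_sectors = lts_t_sectors * (1 - l2_hit_rate)
print(round(naive_miss_sectors))  # ~172,815, vs. the reported 98,312
```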
Why is there a discrepancy between these values? Am I misunderstanding how these metrics are defined?
dram__sectors.sum includes all read and write sectors at the memory controller.
lts__t_sectors.sum counts sector lookups at the L2 tag stage, each of which can hit or miss; it measures accesses to the L2 data RAM, not traffic to DRAM.
lts__t_sectors_lookup_miss.sum is not equivalent to dram__sectors.sum.
Here are some considerations on the differences:
Requests to L2 that miss can result in both fill sectors (reads from DRAM) and evict write-back sectors if the selected victim line is dirty.
Writes (hit or miss) may generate write-through sectors to DRAM. This is not predictable: since L2 is the point of coherence, a write-through is not always required for device memory.
Compressed reads/writes can result in additional DRAM traffic (on a hit or a miss).
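The considerations above can be summarized in a toy accounting sketch. This is purely illustrative (the component names and numbers are hypothetical, not real counters): dram__sectors.sum aggregates several traffic sources, of which only the fill component corresponds to the naive "L2 miss" count.

```python
# Toy accounting model of sector traffic at the memory controller.
# All component values are hypothetical; only `fill` corresponds to the
# naive L2-miss estimate, yet dram__sectors.sum counts all of them.

def dram_sector_estimate(fill, dirty_evict, write_through, compression_extra):
    # fill: sectors read from DRAM to service L2 misses
    # dirty_evict: write-back sectors from evicting dirty L2 lines
    # write_through: write-through sectors generated by writes (hit or miss)
    # compression_extra: additional traffic from compressed reads/writes
    return fill + dirty_evict + write_through + compression_extra

# The same fill count can coincide with very different DRAM totals:
print(dram_sector_estimate(fill=100_000, dirty_evict=0,
                           write_through=0, compression_extra=0))
print(dram_sector_estimate(fill=100_000, dirty_evict=40_000,
                           write_through=25_000, compression_extra=5_000))
```

The point of the sketch is that no formula built only from lts__t_sectors.sum and the hit rate can recover dram__sectors.sum, because the other components are workload- and hardware-dependent.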