Hi,
I used the Visual Profiler on a Mac to profile a sparse matrix-vector multiplication kernel. I found that the number of L1 misses * 15 (the number of SMs) is not equal to the number of L2 requests. In fact, even (L1 misses + L1 hits) * 15 is less than the number of L2 requests. Can someone explain this?
L1 hits: 677342
L1 misses: 2.07111e+06
L2 requests: 1.23936e+08
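
For reference, here is a quick sketch of the arithmetic behind the comparison. It assumes the L1 counters are per-SM values that scale uniformly across all 15 SMs; that is only my assumption, and the profiler may aggregate them differently. The variable names are just for illustration.

```python
# Counter values copied from the profiler output above.
l1_hits = 677342
l1_misses = 2.07111e+06
l2_requests = 1.23936e+08

# Assumption: the L1 counters are reported per SM, so the device-wide
# total is the reported value times the number of SMs.
num_sms = 15

expected_from_misses = l1_misses * num_sms          # L1 misses scaled to all SMs
upper_bound = (l1_hits + l1_misses) * num_sms       # every L1 access scaled to all SMs
ratio = l2_requests / expected_from_misses          # how far off the L2 count is

print(f"L1 misses * {num_sms}:            {expected_from_misses:.6g}")
print(f"(L1 hits + misses) * {num_sms}:   {upper_bound:.6g}")
print(f"L2 requests:                 {l2_requests:.6g}")
print(f"L2 requests / scaled misses: {ratio:.2f}")
```

Under that assumption, the L2 request count is roughly four times the scaled L1 miss count, and it also exceeds the scaled total of all L1 accesses.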