Hello. I have some questions about an Nsight Compute report.
In the attached figure, the number of store sectors going from L1 to L2 is 10,485,760 (box 1), but the number of store sector misses to device in L2 is 0 (box 2).
However, the Device Memory table shows 333,152 store sectors (box 3).
From which metric can this value of 333,152 be inferred?
Similarly, from which metric can the 4 load sectors of device memory (box 4) be inferred?
The 10M sectors going to L2 appear in the second row of the “Sectors” column in the L2 Cache table, and those numbers match. There are no misses to device (box 2) because of the write-back cache policy: a store updates the data in L2 and is not forwarded to device memory at that point. The 333,152 stores to device are likely L2 evictions, which have to be written back to device memory when they are evicted.

As for the 4 load sectors, DRAM has more consumers than just the L1/L2 traffic listed here, for example the instruction cache. Those currently aren’t tracked here, so you can see some additional DRAM traffic that isn’t shown in the L1/L2 tables.
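To illustrate the write-back behavior described above, here is a minimal sketch of a toy cache in Python. It is not a model of the real L2 hardware: the LRU replacement policy, the per-sector tracking, and the names `WriteBackCache` and `simulate` are all assumptions made for illustration. It only shows why stores can reach the cache without being counted as writes to device, and why device writes appear later, when dirty data is evicted.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy LRU write-back cache (hypothetical, not the actual L2 design)."""

    def __init__(self, capacity_sectors):
        self.capacity = capacity_sectors
        self.sectors = OrderedDict()  # sector address -> dirty flag
        self.store_requests = 0       # stores arriving at the cache
        self.device_writes = 0        # dirty sectors written back on eviction

    def store(self, sector):
        self.store_requests += 1
        if sector in self.sectors:
            # Already cached: refresh its LRU position, no device traffic.
            self.sectors.move_to_end(sector)
        elif len(self.sectors) >= self.capacity:
            # Cache is full: evict the least recently used sector.
            _, dirty = self.sectors.popitem(last=False)
            if dirty:
                # Only now does the data go to device memory.
                self.device_writes += 1
        # The store itself only updates the cache and marks the sector dirty.
        self.sectors[sector] = True


def simulate(addresses, capacity_sectors):
    """Replay a sequence of store addresses, return (stores, device writes)."""
    cache = WriteBackCache(capacity_sectors)
    for sector in addresses:
        cache.store(sector)
    return cache.store_requests, cache.device_writes


# Working set fits in the cache: many stores, no device writes.
print(simulate(list(range(100)) * 10, 200))  # (1000, 0)

# Working set exceeds the cache: the overflow shows up as write-backs.
print(simulate(range(1000), 900))            # (1000, 100)
```

In this toy model the device write count depends on the access pattern and the cache capacity, not on the number of stores alone, which is consistent with the store sector count in the Device Memory table being much smaller than the count in the L2 Cache table.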
Thanks for your reply.
As you said, the 333,152 store sectors in the Device Memory table appear to come from L2 cache evictions, but the number of evictions cannot be inferred from the L2 Cache table, right?
That is correct. We don’t have eviction numbers.