What's the meaning of performance counter: lts__t_sectors_srcunit_ltcfabric

Hi all,

I’m trying to analyze the performance counters collected by ncu for a GEMM task, and I encountered a counter: lts__t_sectors_srcunit_ltcfabric_evict_normal_lookup_miss. I looked up “ltcfabric” and “srcunit” in the documentation (Kernel Profiling Guide :: Nsight Compute Documentation), but neither word appears there. What does this counter mean?

The ltcfabric is the communication fabric for the L2 partitions that were introduced in Ampere A100. For more information, see the “A100 L2 Cache” section in the white paper: https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/nvidia-ampere-architecture-whitepaper.pdf. That metric would be related to L2 accesses that go to the other partition.

lts → L2 cache slice
t → T stage on L2 cache
sectors → sector count (what you will collect)
srcunit → source hardware unit that sends data to the current hardware unit
ltcfabric → L2 cache communication fabric

So srcunit_ltcfabric means that another L2 partition sends data to the current L2 partition, and this metric counts the number of sectors sent.

That’s correct.
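
For anyone who wants to apply the token-by-token breakdown above to other counters, here is a minimal Python sketch. It is not part of Nsight Compute: the function name explain_metric and the GLOSSARY table are hypothetical, the glossary only contains the five tokens explained in this thread, and the splitting rule (unit before the double underscore, single underscores after it) is an assumption based on how this one counter name is built.

```python
# Hypothetical helper, not an Nsight Compute API.
# The glossary holds only the tokens explained in this thread.
GLOSSARY = {
    "lts": "L2 cache slice",
    "t": "T stage of the L2 cache",
    "sectors": "sector count (the quantity collected)",
    "srcunit": "source hardware unit that sends the data",
    "ltcfabric": "L2 cache communication fabric",
}


def explain_metric(name):
    """Split a counter name into tokens and look each one up in GLOSSARY.

    Assumption: the unit comes before '__' and the remaining tokens are
    separated by single underscores.
    """
    unit, _, rest = name.partition("__")
    tokens = [unit] + rest.split("_")
    return [(tok, GLOSSARY.get(tok, "(not covered in this thread)")) for tok in tokens]


if __name__ == "__main__":
    counter = "lts__t_sectors_srcunit_ltcfabric_evict_normal_lookup_miss"
    for tok, meaning in explain_metric(counter):
        print(f"{tok:10s} {meaning}")
```

Tokens such as evict, normal, lookup and miss are reported as not covered, since their meaning was not discussed here.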