Difference between lts__t_sectors_srcunit_tex_op_read.sum and lts__t_bytes.sum

Hey, I was profiling the L2 cache traffic of my kernel with ncu, but I don’t know the difference between the metrics lts__t_sectors_srcunit_tex_op_read.sum and lts__t_bytes.sum. They have different values in the ncu results. I didn’t find any detailed explanation in the ncu documentation.

Another question: how can I get a detailed explanation of each ncu metric? Some metrics do not even appear in the documentation, e.g. lts__t_bytes.sum.

Hi, @Rookie_programmer

You can use “ncu --query-metrics-mode suffix --metrics lts__t_sectors_srcunit_tex_op_read” to get the details.

Hi @veraj, the output seems too simple.

ncu --query-metrics-mode suffix --metrics lts__t_bytes outputs the following:

--------------------------------------------------------------------------- --------------- --------------- --------------------
Metric Name                                                                 Metric Type     Metric Unit     Metric Description
--------------------------------------------------------------------------- --------------- --------------- --------------------
lts__t_bytes.avg                                                            Counter         byte            # of bytes requested
lts__t_bytes.avg.peak_sustained                                             Counter         byte/cycle      # of bytes requested
lts__t_bytes.avg.peak_sustained_active                                      Counter         byte            # of bytes requested
lts__t_bytes.avg.peak_sustained_active.per_second                           Counter         byte/second     # of bytes requested
lts__t_bytes.avg.peak_sustained_elapsed                                     Counter         byte            # of bytes requested
lts__t_bytes.avg.peak_sustained_elapsed.per_second                          Counter         byte/second     # of bytes requested
......

ncu --query-metrics-mode suffix --metrics lts__t_sectors_srcunit_tex_op_read outputs the following:

--------------------------------------------------------------------------- --------------- --------------- --------------------
Metric Name                                                                 Metric Type     Metric Unit     Metric Description
--------------------------------------------------------------------------- --------------- --------------- --------------------
lts__t_sectors_srcunit_tex_op_read.avg                                      Counter         sector          # of LTS sectors from unit TEX for reads
lts__t_sectors_srcunit_tex_op_read.avg.peak_sustained                       Counter         sector/cycle    # of LTS sectors from unit TEX for reads
lts__t_sectors_srcunit_tex_op_read.avg.peak_sustained_active                Counter         sector          # of LTS sectors from unit TEX for reads
lts__t_sectors_srcunit_tex_op_read.avg.peak_sustained_active.per_second     Counter         sector/second   # of LTS sectors from unit TEX for reads
.......

I still don’t know the difference between these two metrics. What’s the difference between “bytes requested” and “LTS sectors from unit TEX for reads”?

I want to know how much data my kernel read from the L2 cache. Which metric should I use?

Hi @veraj, can you answer this question? I am very confused.

The lts__t_ metrics are calculated at the tag stage. The tag stage can accept 1 request per cycle, and each request consists of 1-4 sectors. The operation can be read, write, atom, red, evict, … The tools can filter requests by source (srcnode or srcunit), operation, and aperture (device, sysmem, peer).

If no srcunit or srcnode is specified, then the metric covers all requests.
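To make the filtering idea concrete, here is a rough sketch that pulls the source, operation, and aperture qualifiers out of a metric name. The helper parse_lts_metric is hypothetical, and the assumption that each qualifier appears as "_<key>_<value>" inside the name is inferred from the names quoted in this thread, not taken from the ncu documentation.

```python
import re

def parse_lts_metric(name):
    """Split a metric name into base name, rollup suffix, and filter qualifiers.

    Assumption: qualifiers appear as "_<key>_<value>" in the base name,
    as in lts__t_sectors_srcunit_tex_op_read.
    """
    base, _, rollup = name.partition(".")
    filters = {}
    for key in ("srcunit", "srcnode", "aperture", "op"):
        match = re.search(rf"_{key}_([a-z0-9]+)", base)
        if match:
            filters[key] = match.group(1)
    return {"base": base, "rollup": rollup, "filters": filters}

# A filtered metric: only reads that arrive from the TEX (l1tex) unit.
print(parse_lts_metric("lts__t_sectors_srcunit_tex_op_read.sum")["filters"])
# An unfiltered metric: no qualifiers, so it counts every request.
print(parse_lts_metric("lts__t_bytes.sum")["filters"])
```

A name with no qualifiers yields an empty filter set, which matches the statement that an unfiltered metric counts all requests.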

Please note the use of .sum vs. .avg. .avg is only useful as .avg.pct_of_peak_sustained_elapsed or for checking load balancing by comparing .min and .max against .avg.
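As a small illustration of the rollups, the sketch below assumes that .sum, .avg, .min, and .max are computed across the individual L2 unit instances. The per-instance sector counts are made up for the example, and the rollups helper is hypothetical; real values come from the profiler.

```python
def rollups(per_instance_counts):
    """Compute sum/avg/min/max across unit instances, plus a simple imbalance ratio."""
    total = sum(per_instance_counts)
    average = total / len(per_instance_counts)
    highest = max(per_instance_counts)
    return {
        "sum": total,                 # total work: the value to use for data volume
        "avg": average,               # per-instance average
        "min": min(per_instance_counts),
        "max": highest,
        # How far the busiest instance sits above the average (0.0 = perfectly balanced).
        "imbalance": (highest - average) / average,
    }

# Hypothetical sector counts for four L2 instances.
result = rollups([1000, 1040, 980, 1020])
print(result["sum"], result["avg"], result["min"], result["max"])
print(f"imbalance: {result['imbalance']:.1%}")
```

For a question like "how much data did the kernel read", the total is the relevant number; the average only matters when comparing instances against each other.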

lts__t_sectors.sum - total number of 32-byte sectors requested for any operation.
lts__t_bytes.sum = lts__t_sectors.sum x 32 bytes/sector

lts__t_sectors_srcunit_tex_op_read.sum - total number of 32-byte sectors requested by source unit tex (this is all of l1tex, including local, global, surface, and texture accesses) where the operation type is read, from any aperture (device, sysmem, peer).

If you want the equivalent in bytes, then

derived::lts__t_bytes_srcunit_tex_op_read.sum = lts__t_sectors_srcunit_tex_op_read.sum x 32 bytes/sector.

The difference lts__t_sectors.sum - lts__t_sectors_srcunit_tex_op_read.sum is the number of sectors requested by other units, or by tex (l1tex) for operations other than reads.
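The arithmetic above can be sketched as follows. The function l2_breakdown and the input values are hypothetical; in practice the two sector counts would be copied from the ncu report for lts__t_sectors.sum and lts__t_sectors_srcunit_tex_op_read.sum.

```python
SECTOR_BYTES = 32  # bytes per L2 sector, as stated in this thread

def l2_breakdown(total_sectors, tex_read_sectors):
    """Convert sector counts to bytes and split off the non-tex-read remainder."""
    if tex_read_sectors > total_sectors:
        raise ValueError("tex read sectors cannot exceed total sectors")
    other_sectors = total_sectors - tex_read_sectors
    return {
        "total_bytes": total_sectors * SECTOR_BYTES,
        "tex_read_bytes": tex_read_sectors * SECTOR_BYTES,
        # Sectors from other units, or from tex for non-read operations.
        "other_sectors": other_sectors,
        "other_bytes": other_sectors * SECTOR_BYTES,
    }

# Made-up counts for illustration only.
stats = l2_breakdown(total_sectors=1_000_000, tex_read_sectors=750_000)
print(stats["total_bytes"])     # 32000000
print(stats["tex_read_bytes"])  # 24000000
print(stats["other_sectors"])   # 250000
```

With these inputs, tex_read_bytes is the amount of data the kernel read through l1tex from L2, while total_bytes also includes writes, atomics, and traffic from other units.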
