Weird Number for L2 Cache Hitrate

morteza · April 20, 2020, 3:38am

I’m trying to get L2 Cache Hit Rate using Nsight Compute for a simple CUDA code and I’m using the following Section file:

Metrics {
    Label: "l2_hit_rate"
    Name: "lts__t_sector_hit_rate.pct"
}
Metrics {
    Label: "l2_tex_read_hit_rate"
    Name: "lts__t_sector_op_read_hit_rate.pct"
}
Metrics {
    Label: "l2_tex_read_transactions"
    Name: "lts__t_sectors_srcunit_tex_op_read.sum"
}
Metrics {
    Label: "l2_tex_write_hit_rate"
    Name: "lts__t_sector_op_write_hit_rate.pct"
}
Metrics {
    Label: "l2_tex_write_transactions"
    Name: "lts__t_sectors_srcunit_tex_op_write.sum"
}

However, depending on which of these metrics I include, the Nsight Compute output is different. Each result below shows only the metrics that were enabled for that run; the others were commented out in the section file:

l2_hit_rate                  %                         171.46

or

l2_hit_rate                  %                          97.93
l2_tex_read_hit_rate         %                           4.75
l2_tex_write_hit_rate        %                            100
l2_tex_read_transactions     sector                         0
l2_tex_write_transactions    sector                         4

or

l2_hit_rate                   %                         179.27
l2_tex_read_hit_rate          %                       2,418.75
l2_tex_write_hit_rate         %                            100

What is going on here? More importantly, what does a value larger than 100% mean? I’ve seen this behavior before for utilization as well.

I have tried Nsight Compute 2019.5 and 2019.1 on two separate machines, both running Ubuntu 18.04:

GPU: Titan RTX
Driver Version: 430.50
CUDA Version: 10.1

and

GPU: Quadro RTX 8000
Driver Version: 440.64
CUDA Version: 10.2

Greg · April 25, 2020, 7:54pm

The L2 cache is a shared resource in the NVIDIA GPU that is accessed by many different units. A number outside of 0-100% implies that the metric could not be collected accurately. Out-of-range values generally occur when the submitted workload has one or more of the following properties:

Launched kernel is too small to saturate the GPU.
Launched kernel has very different work per CTA.
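
As a quick sanity check, percentage metrics outside 0-100% can be flagged automatically before anyone tries to interpret them. The sketch below is only an illustration: it assumes the three-column text layout shown in the question (label, unit, value), and the helper name `out_of_range_pct` is hypothetical, not part of Nsight Compute.

```python
import re

# One row of the tabular output quoted in the question, e.g.
# "l2_tex_read_hit_rate          %                       2,418.75"
# Assumption: label, unit and value are separated by whitespace.
ROW = re.compile(r"^(\S+)\s+(\S+)\s+([\d,]+(?:\.\d+)?)$")

def out_of_range_pct(report: str) -> dict:
    """Return {label: value} for every '%' metric outside the 0-100 range."""
    flagged = {}
    for line in report.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a metric row
        label, unit, raw = match.groups()
        value = float(raw.replace(",", ""))  # drop thousands separators
        if unit == "%" and not 0.0 <= value <= 100.0:
            flagged[label] = value
    return flagged

report = """\
l2_hit_rate                   %                         179.27
l2_tex_read_hit_rate          %                       2,418.75
l2_tex_write_hit_rate         %                            100
"""
print(out_of_range_pct(report))
```

Any metric this reports should be treated as unreliable for that run rather than as a real cache behavior.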

The example above appears to issue very little work (the first property above). Out-of-range metrics often occur when the profiler replays the kernel launch and the work distribution differs significantly between replays. A metric such as hit rate (hits / queries) can have significant error if hits and queries are collected on different replays and the kernel does not saturate the GPU long enough to reach a steady state (generally > 20 µs). The other cause of significant error is another GPU engine (display, copy engine, video encoder, video decoder, etc.) accessing shared memory during the profiling session. If the kernel is small, the other engine can significantly distort the L2 results. The l2_hit_rate metric includes all clients. The l2_tex metrics are limited to the target kernel, as it will be the only engine using the L1/TEX unit.
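
The arithmetic behind a ratio above 100% can be sketched in a few lines. The counts below are invented purely for illustration (they are not measurements from the kernel in question), and the two dictionaries stand in for two profiler replays that happened to see different amounts of L2 traffic.

```python
def hit_rate_pct(hits: int, queries: int) -> float:
    """Hit rate as a percentage: 100 * hits / queries."""
    return 100.0 * hits / queries

# Hypothetical counter values from two replays of the same tiny kernel.
# The traffic differs between replays because the work distribution
# (or activity from another engine) was not the same.
replay_a = {"hits": 60, "queries": 64}
replay_b = {"hits": 33, "queries": 35}

# Both counters taken from the same replay: the ratio stays within 0-100%.
same_pass = hit_rate_pct(replay_a["hits"], replay_a["queries"])

# Hits from replay A divided by queries from replay B: the ratio is meaningless.
mixed = hit_rate_pct(replay_a["hits"], replay_b["queries"])

print(f"same replay:   {same_pass:.2f}%")  # 93.75%
print(f"mixed replays: {mixed:.2f}%")      # 171.43%
```

With a larger kernel the counts in each replay are much bigger and much closer to each other, so the same mismatch produces only a small relative error.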

Please increase the size of the workload such that it saturates the GPU. This should result in correct metrics.
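
As a rough starting point for sizing the launch, the number of thread blocks needed to fill every SM can be estimated from the device limits. This is a sketch under stated assumptions: the helper name is hypothetical, the values 72 SMs and 1024 resident threads per SM are assumed for a Turing-class GPU such as the Titan RTX, and the real values should be read from `multiProcessorCount` and `maxThreadsPerMultiProcessor` in the device properties. It also ignores register and shared-memory limits, which can lower the number of resident blocks.

```python
def blocks_for_full_waves(sm_count: int, max_threads_per_sm: int,
                          threads_per_block: int, waves: int = 1) -> int:
    """Blocks needed to fill every SM with resident threads, `waves` times over."""
    blocks_per_sm = max_threads_per_sm // threads_per_block
    return sm_count * blocks_per_sm * waves

# Assumed device limits (verify them on the actual GPU).
SM_COUNT = 72
MAX_THREADS_PER_SM = 1024

print(blocks_for_full_waves(SM_COUNT, MAX_THREADS_PER_SM, 256))            # 288
print(blocks_for_full_waves(SM_COUNT, MAX_THREADS_PER_SM, 256, waves=10))  # 2880
```

Filling the GPU once does not by itself guarantee a runtime above the roughly 20 µs mentioned above, so several waves, or more work per thread, may be needed before the metrics settle.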
