Weird Number for L2 Cache Hitrate

The L2 cache is a shared resource in the NVIDIA GPU that is accessed by many different units. A value outside of 0-100% implies that the metric could not be collected accurately. Such an out-of-range value generally occurs when the submitted workload has one or more of the following properties:

  1. Launched kernel is too small to saturate the GPU.
  2. Launched kernel has very different work per CTA.

The example above appears to issue very little work (property 1). Out-of-range metrics often occur when the profiler replays the kernel launch and the work distribution differs significantly between replays. A ratio metric such as hit rate (hits / queries) can have significant error if hits and queries are collected on different replays and the kernel does not run long enough to saturate the GPU and reach a steady state (generally > 20 µs). The other cause of significant error is another GPU engine (display, copy engine, video encoder, video decoder, etc.) accessing the shared L2 cache during the profiling session. If the kernel is small, traffic from the other engine can significantly distort the L2 results. The l2_hit_rate metric includes all clients. The l2_tex metrics are limited to the target kernel, as it will be the only engine using the L1/TEX unit.
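To make the hits / queries point concrete, here is a minimal arithmetic sketch in Python. The counter values are invented for illustration and are not taken from any profiler output; the helper `hit_rate_percent` is likewise hypothetical. It only shows why a ratio built from two different replays is no longer bounded by 100%.

```python
def hit_rate_percent(hits, queries):
    """Hit rate as a simple ratio: hits / queries, expressed in percent."""
    return 100.0 * hits / queries

# Hypothetical counter values for two replays of the same small kernel.
# Replay A runs alone; replay B overlaps with traffic from another L2 client,
# so it sees more L2 activity overall.
replay_a = {"hits": 900, "queries": 1000}
replay_b = {"hits": 1300, "queries": 1500}

# Both counters taken from the same replay: the ratio stays within 0-100%.
same_replay = hit_rate_percent(replay_a["hits"], replay_a["queries"])

# Hits from replay B divided by queries from replay A: no longer bounded.
mixed_replay = hit_rate_percent(replay_b["hits"], replay_a["queries"])

print(f"same replay:  {same_replay:.1f}%")
print(f"mixed replay: {mixed_replay:.1f}%")
```

With these made-up numbers the same-replay ratio is 90% while the mixed ratio is 130%. The longer the kernel runs in a steady state, the smaller the relative difference between replays and the smaller this error becomes.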

Please increase the size of the workload such that it saturates the GPU. This should result in correct metrics.
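As a rough sanity check on workload size, one can compare the number of CTAs in a launch with the number of CTAs the GPU can hold at once. The sketch below is only a back-of-the-envelope estimate: the function name `waves` and the device numbers (80 SMs, 8 resident CTAs per SM) are assumptions for illustration, and filling the GPU several times over is a necessary condition for saturation, not a guarantee that the kernel reaches a steady state.

```python
def waves(num_ctas, num_sms, ctas_per_sm):
    """How many times the launch fills every resident CTA slot on the GPU."""
    return num_ctas / (num_sms * ctas_per_sm)

# Hypothetical device: 80 SMs, each able to hold 8 CTAs of this kernel.
NUM_SMS = 80
CTAS_PER_SM = 8

small_launch = waves(num_ctas=64, num_sms=NUM_SMS, ctas_per_sm=CTAS_PER_SM)
large_launch = waves(num_ctas=64000, num_sms=NUM_SMS, ctas_per_sm=CTAS_PER_SM)

# A value below 1 means some SMs never receive a full set of CTAs.
print(f"small launch: {small_launch:.2f} waves")
print(f"large launch: {large_launch:.2f} waves")
```

The small launch covers only a tenth of the available CTA slots, which matches property 1 in the list above. The kernel duration reported by the profiler remains the more direct indicator.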