L2 cache in A100 provides 179% hit rate!

Please provide a short, complete test case and the full output from your ncu CLI session (not just the memory workload and occupancy sections).

Also see here. That is the most likely cause. If your test case shows that your kernel launch does not saturate the GPU, then the response will be the same: increase the GPU workload to saturate the GPU. (Or just ignore the L2 cache hit rate number.)
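As a rough way to judge whether a launch saturates the GPU, you can compare the grid size with the number of blocks the device can hold resident at once. The sketch below is only back-of-the-envelope arithmetic, not anything the profiler itself computes. The helper names are made up for illustration, and the blocks-per-SM value is an assumed placeholder: for a real kernel it would come from the occupancy section of your ncu output or from `cudaOccupancyMaxActiveBlocksPerMultiprocessor`. The 108 SM count is that of the A100.

```python
def resident_block_capacity(num_sms: int, blocks_per_sm: int) -> int:
    """Blocks the whole GPU can hold resident at one time."""
    if num_sms <= 0 or blocks_per_sm <= 0:
        raise ValueError("num_sms and blocks_per_sm must be positive")
    return num_sms * blocks_per_sm


def waves(grid_blocks: int, num_sms: int, blocks_per_sm: int) -> float:
    """Grid size expressed in full-GPU 'waves'; below 1.0 the GPU is not filled."""
    if grid_blocks <= 0:
        raise ValueError("grid_blocks must be positive")
    return grid_blocks / resident_block_capacity(num_sms, blocks_per_sm)


A100_SMS = 108              # SM count of the A100
ASSUMED_BLOCKS_PER_SM = 2   # assumption: take the real value from your occupancy data
GRID_BLOCKS = 64            # hypothetical launch configuration

capacity = resident_block_capacity(A100_SMS, ASSUMED_BLOCKS_PER_SM)
w = waves(GRID_BLOCKS, A100_SMS, ASSUMED_BLOCKS_PER_SM)
print(f"capacity = {capacity} blocks, launch = {w:.2f} waves")
if w < 1.0:
    print(f"launch does not fill the GPU; consider at least {capacity} blocks")
```

With these assumed numbers, a 64-block grid covers well under one wave, so increasing the problem size (or the grid) until it spans at least one full wave, and preferably several, would be the first thing to try before reading much into the hit rate.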

And if this devolves into a profiler behavior discussion, I will direct you to the profiler forums, as already indicated.