I am profiling a benchmark application (3mm) with the NVIDIA Compute Command Line Profiler on an NVIDIA Quadro P620. The L2 cache hit rate is 81.96, but the L1 cache hit rate is zero.

I have tried different applications, but the results were similar: the L1 cache hit rate is zero while the L2 cache hit rate is not.

However, I tried the same thing on another computer with a GeForce GTX GPU, and the L1 cache hit rate there was around 70. So the application is not the problem.

I remember that the L1 cache hit rate was not zero before, so something must have changed.

Is there any way to fix this?

Some GPUs have the L1 enabled for global loads; others have it disabled for global loads. The Quadro P620 is a Pascal GPU of the GP10x type, and on it the L1 is disabled for global loads by default.
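If you want to see what each of your two machines reports, a minimal sketch like the following may help. It uses only the CUDA runtime call `cudaGetDeviceProperties` and the `globalL1CacheSupported` / `localL1CacheSupported` fields of `cudaDeviceProp`. Note the assumption here: these fields indicate whether the device supports caching globals or locals in L1, which is not necessarily the same as whether a particular build actually has it enabled, so treat the output as a hint rather than a definitive answer.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                         cudaGetErrorString(err));
            return 1;
        }
        // Name and compute capability identify the architecture
        // (for example, 6.x is Pascal).
        std::printf("Device %d: %s (compute capability %d.%d)\n",
                    dev, prop.name, prop.major, prop.minor);
        // Whether the device supports caching global / local loads in L1.
        std::printf("  globalL1CacheSupported: %d\n", prop.globalL1CacheSupported);
        std::printf("  localL1CacheSupported:  %d\n", prop.localL1CacheSupported);
        std::printf("  l2CacheSize (bytes):    %d\n", prop.l2CacheSize);
    }
    return 0;
}
```

Running this on both the Quadro P620 machine and the GeForce GTX machine would at least tell you the exact GPU model and compute capability of each, which is the missing piece of information for comparing the two results.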

So I believe your observation may be expected behavior. Since you haven’t provided any code, and also haven’t indicated the exact model of the GeForce GTX GPU, this is just a guess.

You may be able to “fix this” by specifying the compile command-line switches from the article I previously linked (-Xptxas -dlcm=ca), which enable the L1 for global loads (an opt-in feature).
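As a rough way to test this, here is a small sketch you could build twice, once without and once with the switch, and then profile both binaries the same way you profiled 3mm. The kernel name `blur3`, the file name `l1_test.cu`, and the `CHECK` macro are hypothetical and only for illustration; `-arch=sm_61` assumes the P620 is a compute capability 6.1 device. The kernel re-reads neighboring elements so that there is some reuse for a cache to capture, but whether the hit rate actually changes on your setup is something to measure, not something this sketch guarantees.

```cuda
// Hypothetical build commands (file name is an assumption):
//   nvcc -arch=sm_61 -o l1_default l1_test.cu
//   nvcc -arch=sm_61 -Xptxas -dlcm=ca -o l1_ca l1_test.cu
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(e_));                     \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Each thread loads its own element and both neighbors from global
// memory, so adjacent threads read overlapping addresses.
__global__ void blur3(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int l = (i > 0) ? i - 1 : i;      // clamp at the left edge
    int r = (i < n - 1) ? i + 1 : i;  // clamp at the right edge
    out[i] = (in[l] + in[i] + in[r]) / 3.0f;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> h_in(n, 1.0f), h_out(n, 0.0f);

    float* d_in = nullptr;
    float* d_out = nullptr;
    CHECK(cudaMalloc(&d_in, bytes));
    CHECK(cudaMalloc(&d_out, bytes));
    CHECK(cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    blur3<<<blocks, threads>>>(d_in, d_out, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost));
    std::printf("out[0] = %f\n", h_out[0]);

    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));
    return 0;
}
```

If the L1 hit rate for the second binary is non-zero while the first stays at zero, that would be consistent with the opt-in behavior described above, and you could then add the same switch to the 3mm build.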

