When I profile my application using nvprof 5.0, I see a 90% L2 cache read hit rate but only a 3% L2 cache write hit rate.
The kernel basically reads from a global array, does some computation, and writes the results back to the same array.
Can anyone help me figure out what causes the 97% L2 cache write misses? I expected the L2 read and write hit rates to be almost the same.
I would appreciate your guidance.
GPU: GTX 480
I have bypassed the L1 cache by using the compiler option "-Xptxas -dlcm=cg".
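
For reference, the access pattern described above has roughly the shape of the sketch below. This is a simplified stand-in, not the actual kernel: the name updateInPlace, the array size, the launch configuration, and the arithmetic are all placeholders chosen for illustration. It only shows the read, compute, write-back-to-the-same-address structure.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (placeholder name and arithmetic): each thread loads
// one element, modifies it, and stores it back to the same location.
__global__ void updateInPlace(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];       // global load
        v = v * 2.0f + 1.0f;     // placeholder computation
        data[i] = v;             // global store to the same address
    }
}

int main()
{
    const int n = 1 << 20;       // assumed array size, for illustration only
    float *d_data = NULL;

    cudaMalloc((void **)&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    // One thread per element, 256 threads per block (assumed configuration).
    updateInPlace<<<(n + 255) / 256, 256>>>(d_data, n);
    cudaDeviceSynchronize();

    // Copy one element back so the result can be checked on the host.
    float first = 0.0f;
    cudaMemcpy(&first, d_data, sizeof(float), cudaMemcpyDeviceToHost);
    printf("data[0] = %f\n", first);

    cudaFree(d_data);
    return 0;
}
```

A sketch like this could be built with something along the lines of "nvcc -arch=sm_20 -Xptxas -dlcm=cg sketch.cu -o sketch", where sketch.cu is a hypothetical file name and -arch=sm_20 is assumed to match the GTX 480 (compute capability 2.0).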