L2 Cache Throughput

What factors affect L2 read and write throughput? I have a kernel whose L2 read throughput is much larger than (about 1000 times) its L2 write throughput.

I do not use much shared memory in my application.