"Dissecting GPU Memory Hierarchy through Microbenchmarking"

This might be interesting to some people here: