MisterAnderson42,
thanks for the link. The test is well written and easy to read. The read_only_gmem<> and read_only_tex<> tests deliver much more than half of the peak. I am very convinced :)
In most of the examples we see a big mismatch between the theoretical and the measured memory bandwidth.
The GTX280's theoretical bandwidth is 141.7 GB/s, but the figure reported by the benchmark scripts is about 70 GiB/s (only about 50% of the theoretical peak).
I have a Tesla card whose theoretical bandwidth is 102 GB/s, but when I run the bandwidth benchmarks I get about 71.662087 GiB/s (about 70% of the theoretical peak).
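
One caveat on the percentages: the theoretical peaks are quoted in GB/s while the benchmark reports GiB/s, so the two are not in the same unit. Here is a minimal sketch of the conversion, assuming the benchmark's GiB means 2^30 bytes and the spec-sheet GB means 10^9 bytes. The function names are my own and are not part of the benchmark.

```python
# Assumption: benchmark output is in GiB/s (2**30 bytes per second),
# spec-sheet peak is in GB/s (10**9 bytes per second).
GIB = 2 ** 30
GB = 10 ** 9


def gib_to_gb(gib_per_s):
    """Convert a bandwidth in GiB/s to GB/s."""
    return gib_per_s * GIB / GB


def efficiency(measured_gib_per_s, peak_gb_per_s):
    """Fraction of the theoretical peak reached, with both values in GB/s."""
    return gib_to_gb(measured_gib_per_s) / peak_gb_per_s


if __name__ == "__main__":
    # (card, measured GiB/s, theoretical GB/s) taken from the numbers in this post
    cards = [
        ("GTX280", 70.0, 141.7),
        ("Tesla", 71.662087, 102.0),
    ]
    for name, measured, peak in cards:
        print(f"{name}: {gib_to_gb(measured):.2f} GB/s, "
              f"{100 * efficiency(measured, peak):.1f}% of peak")
```

With the units aligned, the GTX280 comes out at roughly 53% and the Tesla at roughly 75% of peak, so both are a little better than the raw ratios suggest, but the gap to the theoretical figure is still there.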