Measuring GB/s: do texture reads count?


When measuring GB/s, should the bytes read through textures be counted? IMO, if they are counted, the GB/s figure will be inflated by texture cache hits, but if they are not counted, it will be underestimated, so I don’t know what to do.

It’s not trivial to analytically calculate bandwidth usage if some accesses might hit the cache. That’s true for any architecture.
Unless you know the cache hit rate.
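If the hit rate is known, the arithmetic is short. Here is a minimal sketch in Python, assuming a simplified model where every texture miss fetches exactly the requested bytes from DRAM (real hardware fetches whole cache lines, so this is only an approximation). The function name and parameters are made up for illustration:

```python
def dram_bandwidth_gbs(global_bytes, texture_bytes, texture_hit_rate, seconds):
    """Estimate DRAM bandwidth in GB/s given a known texture cache hit rate.

    Assumption: a texture miss costs exactly the requested bytes in DRAM
    traffic; cache-line granularity is ignored.
    """
    if not 0.0 <= texture_hit_rate <= 1.0:
        raise ValueError("texture_hit_rate must be between 0 and 1")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Only the texture reads that miss the cache reach DRAM.
    texture_dram_bytes = texture_bytes * (1.0 - texture_hit_rate)
    dram_bytes = global_bytes + texture_dram_bytes
    return dram_bytes / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical kernel: 4 GB of global traffic, 4 GB of texture reads,
    # 50% texture hit rate, 0.1 s of kernel time.
    print(dram_bandwidth_gbs(4e9, 4e9, 0.5, 0.1))
```

With a hit rate of 0 this counts every texture read; with a hit rate of 1 it counts none of them.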

There’s one easy way - use the profiler.

At least up to CUDA 2.2, the profiler doesn’t count texture accesses :) …mp;#entry527577

Look at post #8 onwards…


Mind you, not counting texture accesses will make the measured bandwidth artificially low, since texture cache misses also consume bandwidth…

I find that assuming every single read is a cache miss usually gives reasonable memory bandwidth results. The cache is so small, it does not really work as a bandwidth multiplier. It works more as an ‘efficient uncoalesced memory reader’.
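To put numbers on the two ways of counting, here is a rough Python sketch that reports both figures for a timed kernel: one that ignores texture reads entirely and one that treats every texture read as a miss. The function name and the example sizes are hypothetical; the byte counts are whatever the kernel requests, and the time would come from your own measurement.

```python
def bandwidth_bounds_gbs(global_bytes, texture_bytes, seconds):
    """Return (ignore_textures, all_misses) bandwidth estimates in GB/s.

    ignore_textures: texture reads are not counted at all.
    all_misses:      every texture read is counted as DRAM traffic.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    ignore_textures = global_bytes / seconds / 1e9
    all_misses = (global_bytes + texture_bytes) / seconds / 1e9
    return ignore_textures, all_misses


if __name__ == "__main__":
    # Hypothetical kernel: reads 64 MiB through a texture, writes 64 MiB
    # to global memory, and takes 2 ms.
    mib = 1024 * 1024
    low, high = bandwidth_bounds_gbs(64 * mib, 64 * mib, 2e-3)
    print(f"ignoring texture reads: {low:.1f} GB/s")
    print(f"all texture reads miss: {high:.1f} GB/s")
```

If the cache is as small as described, the second figure should be the one closer to what the memory bus actually sees.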