Basic texture cache question: inter- or intra-block?

sdj256 · January 30, 2008, 3:23am

Where are the texture cache(s) located? Can a texture cache provide a way to localize data between blocks?

MisterAnderson42 · January 30, 2008, 4:16am

In principle: yes. It is a cache after all.
In practice: not really. My experience and testing indicate that, to maximize the performance of the cache, you really only need data locality within the threads of each individual warp.

With data locality within each warp, even semi-random access patterns can achieve 70 GiB/s.
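To make "semi-random, but local within a warp" concrete, here is a minimal host-side C++ sketch of one way such an index pattern could be built. It is only an illustration of the idea, not the benchmark used for the number above: the helper names `warp_local_indices` and `warp_span`, the `window` parameter, and the choice of one random base address per warp are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

constexpr std::size_t kWarpSize = 32;  // threads per warp

// Hypothetical helper: builds one read index per thread. Each warp gets a
// random base index, and its 32 lanes read within `window` elements of it,
// so accesses are random across warps but local within a warp.
std::vector<std::size_t> warp_local_indices(std::size_t n_warps,
                                            std::size_t n_elements,
                                            std::size_t window,
                                            std::uint32_t seed) {
    assert(window >= 1 && window <= n_elements);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> base_dist(0, n_elements - window);
    std::uniform_int_distribution<std::size_t> lane_dist(0, window - 1);

    std::vector<std::size_t> idx;
    idx.reserve(n_warps * kWarpSize);
    for (std::size_t w = 0; w < n_warps; ++w) {
        const std::size_t base = base_dist(rng);   // random across warps
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
            idx.push_back(base + lane_dist(rng));  // local within the warp
        }
    }
    return idx;
}

// Distance between the smallest and largest index read by warp `w`.
std::size_t warp_span(const std::vector<std::size_t>& idx, std::size_t w) {
    const auto first = idx.begin() + static_cast<std::ptrdiff_t>(w * kWarpSize);
    const auto mm = std::minmax_element(
        first, first + static_cast<std::ptrdiff_t>(kWarpSize));
    return *mm.second - *mm.first;
}
```

An index array like this could be uploaded and used as the fetch coordinate for each thread. A smaller `window` gives tighter per-warp locality, while the bases stay scattered across the whole array.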

sdj256 · January 30, 2008, 5:16am

Thanks - in your experience, cache only works per-warp then? 5.1.2.3 agrees:

"…The texture cache is optimized for 2D spatial locality, so [b]threads of the same warp[/b] that read texture addresses that are close together will achieve best performance…"

That makes the cache sound per-warp and not global, but I’m hoping for a cache that can span multiple blocks. Guess it doesn’t work that way!

wumpus · January 30, 2008, 7:54am

The cache is very small (8 KB, as I recall), and mainly exists to make (bi|tri)linear interpolation fast. Because it is so small, even if it is global you probably won’t notice any benefit between blocks.

MisterAnderson42 · January 30, 2008, 2:05pm

Yeah, the “per warp” behavior is sort of a natural result of the small cache combined with the interleaved execution of warps. Each multiprocessor can run 24 warps concurrently. Say each thread loads a float4: 24 warps × 32 threads × 16 bytes = 12,288 bytes, so we’ve already exceeded the 8 KB cache. Presuming the texture “cache” operates like a standard cache with some kind of replacement policy, by the time the last of the 24 concurrent warps issues its load, values from the earlier warps have already been evicted to make room for others.
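The arithmetic above can be written out as a small C++ sketch. It assumes that "8k" means 8192 bytes and that the cache serves a single multiprocessor, as the estimate above treats it. Neither figure is verified here, and the function names are made up for illustration.

```cpp
#include <cstddef>

constexpr std::size_t kWarpsPerMultiprocessor = 24;  // figure from the post
constexpr std::size_t kThreadsPerWarp = 32;
constexpr std::size_t kBytesPerFloat4 = 16;          // 4 floats x 4 bytes
constexpr std::size_t kCacheBytes = 8 * 1024;        // assumed meaning of "8k"

// Bytes touched if every thread of every resident warp loads one element.
constexpr std::size_t working_set_bytes(std::size_t warps,
                                        std::size_t threads_per_warp,
                                        std::size_t bytes_per_thread) {
    return warps * threads_per_warp * bytes_per_thread;
}

// How many whole warps' worth of loads fit in the cache at once.
constexpr std::size_t warps_that_fit(std::size_t cache_bytes,
                                     std::size_t threads_per_warp,
                                     std::size_t bytes_per_thread) {
    return cache_bytes / (threads_per_warp * bytes_per_thread);
}

static_assert(working_set_bytes(kWarpsPerMultiprocessor, kThreadsPerWarp,
                                kBytesPerFloat4) == 12288,
              "24 warps x 32 threads x 16 bytes");
static_assert(working_set_bytes(kWarpsPerMultiprocessor, kThreadsPerWarp,
                                kBytesPerFloat4) > kCacheBytes,
              "one float4 load per thread already exceeds the assumed cache");
```

Under these assumptions only 16 of the 24 warps' float4 loads (8192 / 512) would fit at once, which is consistent with data from earlier warps being gone before any later warp or block could reuse it.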

In discussions with tachyon_john and others, we came to the conclusion that something like “uncoalesced memory reader” would be a better name for the texture cache. Maybe “almost coalesced memory reader” would be even better, since it hints at the data locality needed in a warp’s accesses.