Is there any way we can check the cache size of texture memory on any video card? I am using Tesla c1060, 9800GT, and 9800 GX2.
deviceQuery does not report this information, and cudaDeviceProp has no field for it.
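For what it's worth, here is a minimal sketch of what cudaDeviceProp does expose on these cards; as far as I know there is no texture-cache-size field, so that number has to come from the vendor's documentation instead:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);  // query device 0

    // Fields cudaDeviceProp actually provides:
    printf("Device: %s\n", prop.name);
    printf("Multiprocessors: %d\n", prop.multiProcessorCount);
    printf("Shared mem per block: %zu bytes\n", prop.sharedMemPerBlock);
    printf("Constant memory: %zu bytes\n", prop.totalConstMem);
    // Note: there is no field like prop.textureCacheSize --
    // the texture cache size is simply not exposed by the runtime.
    return 0;
}
```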
Another question: is the texture cache distributed across the 30 multiprocessors, so that each multiprocessor has an equal share of cache for texture memory access? Or is it one combined cache?
Wait for Fermi: its cache is far superior to previous generations and is already well documented in the whitepapers.
As for the cache size on the current hardware: SMs are grouped into TPCs (texture processing clusters), and the TPC contains the cache. On G80/G92 there are 2 SMs per TPC; on GT200 there are 3. As for the total cache per TPC, I haven't seen that number published, apart from the per-SM cache figure you already referenced.
Ask yourself this: does it really matter? 8 KB is already too small to get any kind of temporal locality out of the cache, so the difference between 6 KB and 8 KB is tiny when the cache is really just being used as an efficient uncoalesced memory reader.
So, are the caches of the SMs within one TPC combined? Suppose there are 3 multiprocessors in one TPC, each with 8 KB of texture cache: do these 3 multiprocessors then share a 24 KB texture cache within the TPC, and can all three access those 24 KB efficiently in an uncoalesced manner through a shared bus or something like that?
I hope I understand the concept of TPCs and texture caches correctly, because this is the first time I have heard the term.
Actually, the size of the texture cache really matters if we want to tile uncoalesced memory accesses. Suppose that in an application, uncoalesced access to some part of the data is unavoidable. We can partition that data into small tiles so that each tile fits in the texture cache. If each tile is then accessed frequently, the cache saves a lot of time.
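A rough sketch of that tiling idea, using the legacy texture-reference API of that hardware generation. The names and the TILE size are assumptions for illustration; the point is that if a block's (uncoalesced) indices stay inside one tile-sized window, the fetches should hit the texture cache:

```cuda
#include <cuda_runtime.h>

// Legacy 1D texture reference (pre-Fermi style); reads go through the texture cache.
texture<float, 1, cudaReadModeElementType> texData;

// TILE is a hypothetical tuning knob chosen to fit the per-SM texture cache.
#define TILE 2048   // 2048 floats = 8 KB, matching the documented per-SM cache size

__global__ void tiledGather(float* out, const int* idx, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // idx[] produces an uncoalesced access pattern, but if the indices used by
    // one block stay within a TILE-sized window, repeated fetches are served
    // from the texture cache instead of device memory.
    out[i] = tex1Dfetch(texData, idx[i]);
}

// Host side (sketch): bind the device array to the texture once, then launch.
// cudaBindTexture(0, texData, d_data, n * sizeof(float));
// tiledGather<<<(n + 255) / 256, 256>>>(d_out, d_idx, n);
```

Whether this wins in practice depends on how much reuse each tile actually gets; with only ~8 KB per SM the window is small, as the previous reply points out.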