L2 Texture Cache

Hello everybody,

I'd like to know the size of the texture L2 cache and how many of these blocks there are in the architecture.
I've looked for this information everywhere, but I couldn't find anything official.

My card is a GTX295 (with 240 SPs and 30 SMs).

Thanks

The L2 cache size is 0 KB (there is no L2 cache on second-generation hardware, only on Fermi). What you do have is 16 KB of shared memory per multiprocessor (you can think of it as a user-controlled cache), and I believe an 8 KB non-coherent, read-only texture cache per group of 3 multiprocessors (the documentation is not 100% clear about the exact sizes; from the papers, I believe it is split into global and local caches), plus an 8 KB constant memory cache.
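
To put those per-unit figures in perspective, here is a minimal sketch that totals them over one GPU. It assumes the 30 SMs from the original question and the unofficial sizes quoted above; the constant and function names are made up for illustration.

```python
# Per-GPU totals from the per-unit figures quoted above.
# All sizes are unofficial and taken from the post, not from Nvidia documentation.
SM_COUNT = 30            # SMs per GPU, from the original question
SHARED_KB_PER_SM = 16    # shared memory per SM
SMS_PER_TEX_CACHE = 3    # one texture cache per group of 3 SMs (as believed above)
TEX_KB_PER_GROUP = 8     # texture cache size per group (as believed above)


def on_chip_totals_kb(sm_count=SM_COUNT):
    """Return total shared memory and texture cache in KB for one GPU."""
    groups = sm_count // SMS_PER_TEX_CACHE
    return {
        "shared_kb": sm_count * SHARED_KB_PER_SM,
        "texture_kb": groups * TEX_KB_PER_GROUP,
    }


print(on_chip_totals_kb())  # {'shared_kb': 480, 'texture_kb': 80}
```

So under these assumptions the texture cache adds up to 80 KB across the chip, compared with 480 KB of shared memory.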

Thank you for the answer. Sorry, but the information you can find on the forum is not clear. For example, I found this about the GeForce 8800:

http://forums.nvidia.com/index.php?showtop…rt=#entry296990

and 8 KB per 3 SMs on the GTX seems very small to me…

You won't find anything official from Nvidia about this, but on compute capability 1.x devices there seem to be 32 KB of L2 cache per 64 bits of memory bus width. For the GTX 295, with its 448-bit bus per GPU, that would amount to 7 × 32 KB = 224 KB of L2 cache per device.
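
The rule of thumb above can be written as a small calculation. This is only a sketch of that unofficial estimate, assuming a 448-bit memory bus per GPU on the GTX 295 (taken from public specs, not from this thread); the function name is hypothetical.

```python
def estimated_l2_kb(bus_width_bits, kb_per_partition=32, partition_bits=64):
    """Estimate L2 texture cache size in KB from the memory bus width.

    Uses the unofficial rule of thumb of 32 KB of L2 per 64 bits of bus width.
    """
    if bus_width_bits % partition_bits != 0:
        raise ValueError("bus width is expected to be a multiple of the partition width")
    partitions = bus_width_bits // partition_bits
    return partitions * kb_per_partition


# Assumption: 448-bit bus per GPU on the GTX 295.
print(estimated_l2_kb(448))  # 224
```

The same rule gives 256 KB for a 512-bit bus.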

Without anything official, I'm finding it difficult to work with textures and evaluate their performance…

Also, in cudaprof, are tlb hit/miss the counters I should look at for textures?

Yes, the tlb hit/miss counters are for the texture caches. I don't know whether they refer to the texture L1 or L2 cache.
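
Once you have the two counter values from a profiler run, a hit rate is the usual way to compare kernels. A minimal sketch, assuming you copy the hit and miss counts by hand from the cudaprof output; the function name and the sample numbers are hypothetical.

```python
def texture_cache_hit_rate(hits, misses):
    """Return hits / (hits + misses), or None if no accesses were counted."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total


# Hypothetical counter values, for illustration only.
print(texture_cache_hit_rate(900, 100))  # 0.9
```

Since it is unclear which cache level the counters cover, the result is better used to compare two versions of the same kernel than as an absolute number.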

The GT200 architecture papers include a diagram and some information about the L1 and L2 texture caches:

http://www.nvidia.com/docs/IO/55506/GeForc…nical_Brief.pdf

I found this link that may be of interest

http://forums.nvidia.com/index.php?showtopic=54679

Also look here for extensive measurements.