Texture cache characteristics 2D cache size

mschatz · May 2, 2007, 8:52pm

Can someone describe the capabilities of the 2D cache? Section 6.1.2.3 says it is optimized for 2D spatial locality, but I can’t find any details. For example, if I have a 2D texture of ulong4, how many texels are in the same cache block? Am I correct in assuming the cache blocks are regularly sized squares tiling the texture? I am getting good results empirically when the data is organized into 32x32 blocks, but I’m not 100% sure whether this is just accidental. Do the cache block sizes depend on the memory type?

Thanks,

Mike Schatz
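
For readers wondering what "organized into 32x32 blocks" might look like in practice, here is a minimal host-side sketch of one possible reading: the data is reordered so that each 32x32 tile is contiguous in memory. The names `kTile`, `tiled_offset`, and `to_tiled` are hypothetical, and the tile size is taken from the question, not from any documented cache block size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge, taken from the 32x32 blocking in the question.
// It is NOT a documented property of the texture cache.
constexpr std::size_t kTile = 32;

// Map texel (x, y) of a `width`-wide image to its offset in a tiled layout.
// Tiles are stored one after another in row-major tile order, and texels
// inside a tile are row-major as well. Assumes width is a multiple of kTile.
std::size_t tiled_offset(std::size_t x, std::size_t y, std::size_t width) {
    assert(width % kTile == 0);
    const std::size_t tiles_per_row = width / kTile;
    const std::size_t tile_index = (y / kTile) * tiles_per_row + (x / kTile);
    const std::size_t within_tile = (y % kTile) * kTile + (x % kTile);
    return tile_index * kTile * kTile + within_tile;
}

// Copy a row-major image into the tiled layout described above.
template <typename T>
std::vector<T> to_tiled(const std::vector<T>& src,
                        std::size_t width, std::size_t height) {
    assert(width % kTile == 0 && height % kTile == 0);
    assert(src.size() == width * height);
    std::vector<T> dst(src.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            dst[tiled_offset(x, y, width)] = src[y * width + x];
        }
    }
    return dst;
}
```

With this layout, vertically adjacent texels inside a tile are 32 elements apart regardless of the image width, which is one way neighbouring accesses in both dimensions can end up close together in memory. Whether the hardware cache blocks match such tiles is exactly the open question of this thread.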

yk_cadcg · May 6, 2007, 6:59am

I’m not able to answer your question, but I can share two results from my own experiments:

1. With random access, a 2D texture doesn’t outperform global memory. Even when the whole data set is smaller than 8 KB (the cache size), the texture cache helps little; the performance is the same as with global memory. I can’t think of why.

2. Loads from global memory into shared memory or registers are block-based, not word-based. That is, a single load such as sm[tx] = (int) gm[i]; can bring in more than the word we asked for from global memory. So there is room to exploit spatial locality and treat this like a small cache.
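
To make the second point concrete, here is a rough host-side model that counts how many aligned memory segments a group of threads touches for a given access stride. The function name `segments_touched` and the `segment_bytes` parameter are assumptions for illustration only; the real transfer granularity is not stated in this thread.

```cpp
#include <cstddef>
#include <set>

// Count the distinct aligned segments touched when `threads` threads each read
// one element of `elem_bytes` bytes, with thread t reading index t * stride.
// `segment_bytes` is a hypothetical transfer granularity, not a documented
// hardware value.
std::size_t segments_touched(std::size_t threads, std::size_t stride,
                             std::size_t elem_bytes,
                             std::size_t segment_bytes) {
    std::set<std::size_t> segments;
    for (std::size_t t = 0; t < threads; ++t) {
        const std::size_t byte_address = t * stride * elem_bytes;
        segments.insert(byte_address / segment_bytes);
    }
    return segments.size();
}
```

Under this model, 16 threads reading consecutive 4-byte words with an assumed 64-byte segment touch a single segment, while the same threads with a stride of 16 touch 16 segments. If loads really are block-based, the first pattern gets the neighbouring words for free and the second does not.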

fhm_felix · May 6, 2007, 10:13pm

I’m sorry I can’t give you an answer, but I have another question regarding the texture cache:
Section 5.1 (General Specification) says “The cache working set for 1D textures is 8 KB per multiprocessor”.
I am confused about the “1D”. What does this mean? Are only 2D textures cached? Probably not.

Thanks for help,

Felix

yk_cadcg · May 7, 2007, 2:35am

Yes, only 2D is cached; 1D is not. Please check Mark Harris’s reply to my earlier posts, if I’m not mistaken.

Simon_Green · May 7, 2007, 2:07pm

No, all types of texture accesses (1D/2D/3D/cube) are cached.

I think what Mark was trying to say is that textures are generally optimized for 2D coherency. 1D textures are essentially just a special case of 2D textures where the height equals one.

prkipfer · May 8, 2007, 10:47am

1D textures are bound to linear memory. They are cached but, naturally, exhibit coherency only in the “storage direction” of the linear memory, which is basically the same effect that you get with coalesced global memory access.

2D textures are stored in an array memory layout and have a special 2D cache. No GPU vendor will tell you exactly how this works, but the effect is that accesses are faster in both dimensions now.

The texture cache works very well. See my post in the other thread. Using the texture cache in this example speeds up the execution by a factor of about 1.7 per thread, from a mean of 77220 to 45156 (screenshots 3 & 5).

Peter
