1D versus 2D textures: spatial locality


I’d just like to confirm:

  1. Is the spatial locality attributed to texture usage only 1D locality when using linear memory,
    and 2D locality when using CUDA arrays?

  2. I have not found any reference to the degree of locality. When fetching an (x,y) element,
    how many elements are fetched into the texture cache? What is the “radius” of the neighborhood
    around (x,y) that goes into the cache? In the 1D case, is there any information on the extent of locality?

Thanks a lot!


Yes. You get the 2D locality with a 2D CUDA array and tex2D. There are also 1D arrays if you need 1D filtering.
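For reference, here is a minimal sketch of the 2D CUDA array plus tex2D setup described above. It assumes the texture object API (CUDA 5.0 or later, compute capability 3.0 or later) rather than the older texture references; the kernel name `copyFromTex2D` and the 64x64 size are made up for illustration.

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Each thread reads one texel. With a 16x16 block, each warp covers a 16x2 tile,
// so neighboring threads read neighboring texels in both x and y.
__global__ void copyFromTex2D(cudaTextureObject_t tex, float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        // Unnormalized coordinates with point sampling: +0.5f addresses the texel center.
        out[y * width + x] = tex2D<float>(tex, x + 0.5f, y + 0.5f);
    }
}

int main()
{
    const int width = 64, height = 64;  // illustrative size (assumption)
    std::vector<float> host(width * height);
    for (int i = 0; i < width * height; ++i) host[i] = static_cast<float>(i);

    // Allocate a 2D CUDA array and copy the host data into it.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t array;
    cudaMallocArray(&array, &desc, width, height);
    cudaMemcpy2DToArray(array, 0, 0, host.data(), width * sizeof(float),
                        width * sizeof(float), height, cudaMemcpyHostToDevice);

    // Describe the resource (the array) and how it is sampled.
    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;      // no filtering
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    float* dOut = nullptr;
    cudaMalloc(&dOut, width * height * sizeof(float));
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    copyFromTex2D<<<grid, block>>>(tex, dOut, width, height);

    std::vector<float> result(width * height);
    cudaMemcpy(result.data(), dOut, width * height * sizeof(float), cudaMemcpyDeviceToHost);

    int mismatches = 0;
    for (int i = 0; i < width * height; ++i) mismatches += (result[i] != host[i]);
    printf("mismatches: %d (%s)\n", mismatches, cudaGetErrorString(cudaGetLastError()));

    cudaDestroyTextureObject(tex);
    cudaFreeArray(array);
    cudaFree(dOut);
    return 0;
}
```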

As for the degree of locality, it is not documented anywhere that I am aware of. My experience with the texture cache is that it is much too small for any kind of temporal locality: other warps running on the multiprocessor will flush the cache before the scheduler gets back to the first one. What matters most is getting spatially local memory accesses within each warp. For 1D textures, think of the cache as an efficient “almost coalesced” memory reader.
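To illustrate the “almost coalesced” reader idea, here is a rough sketch that reads linear memory through a 1D texture with a small misalignment. It assumes the texture object API (CUDA 5.0 or later); the kernel name `shiftedCopy` and the `shift` value are hypothetical, and the sketch only checks correctness, not speed.

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Each thread reads element (i + shift) through the texture path. With a nonzero
// shift, a warp's reads are contiguous but not aligned to the warp's base index.
__global__ void shiftedCopy(cudaTextureObject_t tex, float* out, int n, int shift)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i + shift < n) {
        out[i] = tex1Dfetch<float>(tex, i + shift);  // integer index, no filtering
    }
}

int main()
{
    const int n = 1 << 20;
    const int shift = 3;  // hypothetical misalignment (assumption)
    const size_t bytes = n * sizeof(float);

    std::vector<float> host(n);
    for (int i = 0; i < n; ++i) host[i] = static_cast<float>(i);

    float *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemset(dOut, 0, bytes);
    cudaMemcpy(dIn, host.data(), bytes, cudaMemcpyHostToDevice);

    // Wrap the plain device pointer in a 1D linear-memory texture.
    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = dIn;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = bytes;

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    shiftedCopy<<<(n + 255) / 256, 256>>>(tex, dOut, n, shift);

    std::vector<float> result(n);
    cudaMemcpy(result.data(), dOut, bytes, cudaMemcpyDeviceToHost);

    int mismatches = 0;
    for (int i = 0; i + shift < n; ++i) mismatches += (result[i] != host[i + shift]);
    printf("mismatches: %d (%s)\n", mismatches, cudaGetErrorString(cudaGetLastError()));

    cudaDestroyTextureObject(tex);
    cudaFree(dIn);
    cudaFree(dOut);
    return 0;
}
```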

I’ve never actually tried testing the performance of the texture cache based on the width of each warp’s coverage. Maybe I’ll write a quick benchmark later today.
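A benchmark along those lines might look roughly like the sketch below. It is not a measured result, just one possible setup: all blocks have 256 threads, and the block width decides the shape of the tile each warp covers (32x1, 16x2, 8x4, 4x8). The kernel name `gather`, the 3x3 neighborhood, and the 2048x2048 size are assumptions, and it uses the texture object API (CUDA 5.0 or later).

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Sum a 3x3 neighborhood around each texel so that texture reads dominate the runtime.
__global__ void gather(cudaTextureObject_t tex, float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    float sum = 0.0f;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            sum += tex2D<float>(tex, x + dx + 0.5f, y + dy + 0.5f);  // clamped at the edges
    out[y * width + x] = sum;
}

// Time one launch (after a warm-up launch) with CUDA events; returns milliseconds.
static float timeLaunch(cudaTextureObject_t tex, float* dOut, int w, int h, dim3 block)
{
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    gather<<<grid, block>>>(tex, dOut, w, h);  // warm-up
    cudaEventRecord(start);
    gather<<<grid, block>>>(tex, dOut, w, h);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main()
{
    const int w = 2048, h = 2048;  // illustrative size (assumption)
    std::vector<float> host(static_cast<size_t>(w) * h, 1.0f);

    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t array;
    cudaMallocArray(&array, &desc, w, h);
    cudaMemcpy2DToArray(array, 0, 0, host.data(), w * sizeof(float),
                        w * sizeof(float), h, cudaMemcpyHostToDevice);

    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    float* dOut = nullptr;
    cudaMalloc(&dOut, static_cast<size_t>(w) * h * sizeof(float));

    // Threads are linearized as x + y * blockDim.x, so blockDim.x sets the warp's tile width.
    dim3 shapes[] = { dim3(32, 8), dim3(16, 16), dim3(8, 32), dim3(4, 64) };
    for (const dim3& block : shapes) {
        float ms = timeLaunch(tex, dOut, w, h, block);
        printf("warp tile %2u x %2u: %.3f ms\n", block.x, 32 / block.x, ms);
    }
    printf("status: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaDestroyTextureObject(tex);
    cudaFreeArray(array);
    cudaFree(dOut);
    return 0;
}
```

Timings from a sketch like this depend heavily on the GPU generation, so they should be read as relative comparisons between warp tile shapes on one device only.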

Thanks. What you say makes sense. I also found something interesting: I was using 2D textures, and switched to 1D linear textures, which use far fewer registers. As you say, if my accesses are almost coalesced (or coalesced) without textures, then 2D textures are not required. Furthermore, if the idea is to implement a ping-pong buffer, 2D textures are pretty useless, whereas 1D textures are very efficient. Of course, I have no need for filtering, although borders would be useful.
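For anyone wanting to try the ping-pong idea, here is a rough sketch with two linear buffers, each wrapped in a 1D texture. It assumes the texture object API (CUDA 5.0 or later); the names `smoothStep` and `makeLinearTexture` and the 3-point average are made up for illustration. Since `tex1Dfetch` does not apply addressing modes, the borders are clamped by hand in the kernel. Each step reads only data written by an earlier kernel launch, because texture reads are not guaranteed to see writes made within the same launch.

```cuda
#include <cstdio>
#include <cstring>
#include <utility>
#include <vector>
#include <cuda_runtime.h>

// One ping-pong step: read the source buffer through its texture, write the
// destination buffer through a plain pointer. Borders are clamped manually.
__global__ void smoothStep(cudaTextureObject_t src, float* dst, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int left = max(i - 1, 0);
    int right = min(i + 1, n - 1);
    dst[i] = (tex1Dfetch<float>(src, left) +
              tex1Dfetch<float>(src, i) +
              tex1Dfetch<float>(src, right)) / 3.0f;
}

// Hypothetical helper: wrap a device pointer in a 1D linear-memory texture object.
static cudaTextureObject_t makeLinearTexture(float* devPtr, int n)
{
    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = devPtr;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = n * sizeof(float);

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);
    return tex;
}

int main()
{
    const int n = 1 << 16;   // illustrative size (assumption)
    const int steps = 10;    // illustrative iteration count (assumption)
    const size_t bytes = n * sizeof(float);

    std::vector<float> host(n, 0.0f);
    host[n / 2] = 1.0f;      // single spike that the stencil spreads out

    float* buf[2] = { nullptr, nullptr };
    cudaTextureObject_t tex[2];
    for (int k = 0; k < 2; ++k) {
        cudaMalloc(&buf[k], bytes);
        tex[k] = makeLinearTexture(buf[k], n);
    }
    cudaMemcpy(buf[0], host.data(), bytes, cudaMemcpyHostToDevice);

    int src = 0, dst = 1;
    for (int s = 0; s < steps; ++s) {
        smoothStep<<<(n + 255) / 256, 256>>>(tex[src], buf[dst], n);
        std::swap(src, dst);  // no copy needed: only the roles of the buffers change
    }

    // After the final swap, buf[src] holds the most recent result.
    cudaMemcpy(host.data(), buf[src], bytes, cudaMemcpyDeviceToHost);
    printf("center value after %d steps: %f (%s)\n", steps, host[n / 2],
           cudaGetErrorString(cudaGetLastError()));

    for (int k = 0; k < 2; ++k) {
        cudaDestroyTextureObject(tex[k]);
        cudaFree(buf[k]);
    }
    return 0;
}
```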

I am looking forward to your benchmark results if you get around to it.