About texture cache and spatial locality

Hello up there at Michigan Tech. I got my undergrad degree there :)

There wouldn’t be much point to 3D textures if they didn’t offer 3D spatial locality in their accesses. This has been mentioned somewhat officially in some forum posts before, though not recently. You could always write a microbenchmark to verify: just a simple kernel that copies data out of a 3D texture into global memory. Run it with the threads reading along each of the 3 directions and compare the timings.
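
Here is a rough sketch of what such a microbenchmark could look like. It is only an illustration, not something I have tuned: it assumes the texture object API (cudaTextureObject_t), and the names copyFromTex3D and dir, the 128^3 volume size, and the block size are all my own arbitrary choices. The writes to global memory are coalesced in every case, so only the texture read direction changes between runs.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Bail out of main() on any CUDA runtime error.
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    printf("CUDA error: %s (line %d)\n", cudaGetErrorString(e), __LINE__); \
    return 1; } } while (0)

// dir picks the axis that consecutive threads walk along: 0 = x, 1 = y, 2 = z.
__global__ void copyFromTex3D(cudaTextureObject_t tex, float *out, int n, int dir)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n * n * n) return;
    int a = i % n;        // fastest-varying index across threads
    int b = (i / n) % n;
    int c = i / (n * n);  // slowest-varying index
    int x, y, z;
    if (dir == 0)      { x = a; y = b; z = c; }
    else if (dir == 1) { y = a; z = b; x = c; }
    else               { z = a; x = b; y = c; }
    // Unnormalized coordinates with point filtering: +0.5f hits the texel center.
    out[i] = tex3D<float>(tex, x + 0.5f, y + 0.5f, z + 0.5f);
}

int main()
{
    const int n = 128;  // hypothetical volume edge length
    const size_t count = (size_t)n * n * n;
    std::vector<float> h(count);
    for (size_t i = 0; i < count; ++i) h[i] = (float)(i % 1024);

    // Allocate a 3D cudaArray and copy the host volume into it.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaExtent extent = make_cudaExtent(n, n, n);
    cudaArray_t arr;
    CHECK(cudaMalloc3DArray(&arr, &desc, extent));
    cudaMemcpy3DParms cp = {};
    cp.srcPtr = make_cudaPitchedPtr(h.data(), n * sizeof(float), n, n);
    cp.dstArray = arr;
    cp.extent = extent;
    cp.kind = cudaMemcpyHostToDevice;
    CHECK(cudaMemcpy3D(&cp));

    // Describe the texture: point sampling, clamped, unnormalized coordinates.
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;
    cudaTextureDesc td = {};
    td.addressMode[0] = td.addressMode[1] = td.addressMode[2] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModePoint;
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;
    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &res, &td, nullptr));

    float *d_out = nullptr;
    CHECK(cudaMalloc(&d_out, count * sizeof(float)));
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    const int threads = 256;
    const int blocks = (int)((count + threads - 1) / threads);
    const char *names[3] = { "x", "y", "z" };
    for (int dir = 0; dir < 3; ++dir) {
        // Untimed warm-up launch, then one timed launch.
        copyFromTex3D<<<blocks, threads>>>(tex, d_out, n, dir);
        CHECK(cudaEventRecord(start));
        copyFromTex3D<<<blocks, threads>>>(tex, d_out, n, dir);
        CHECK(cudaEventRecord(stop));
        CHECK(cudaEventSynchronize(stop));
        CHECK(cudaGetLastError());
        float ms = 0.0f;
        CHECK(cudaEventElapsedTime(&ms, start, stop));
        printf("threads walk along %s: %.3f ms\n", names[dir], ms);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_out);
    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    return 0;
}
```

Build it with something like nvcc -O2 and compare the three timings. A single timed launch is noisy, so in practice you would want to average over many launches and try a few volume sizes before drawing conclusions.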

To clarify, I’ll also point out that only textures in 2D cudaArrays are laid out for 2D spatial locality. Textures bound to global memory and read with tex1Dfetch only have 1D locality.