Question about textures

Hi everyone,

I have a couple of questions about textures.

  1. Is reading from a texture faster than reading from global memory? For instance, if I have to read from large arrays over and over again from each thread, is it best to put them in a texture?

  2. Is writing to a texture faster than writing to global memory?

  3. Does anyone have a basic example (either on the web or in the SDK) that shows how to use textures for array access?

Thanks for any help, and I apologize if I’m not making sense. I did not find the manual’s discussion of textures particularly helpful. Thanks again.

Generally speaking, no, textures are not faster than global memory if your access pattern is coalesced. Textures are faster if you need to read elements in an uncoalesced, but spatially local, way. For example, if your thread is going to loop over a small square region inside a larger array, a texture can help. Textures won’t help you if you are accessing memory with a large stride, like looping through a big row-major array in column-major order.
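To make the "small square region" case concrete, here is a minimal sketch in which each thread sums the 3x3 neighborhood around its own element of a 2D array. It assumes the texture object API from later CUDA releases (CUDA 5.0 or newer, compute capability 3.0 or higher) rather than the texture references available when this thread was written. The names `sum3x3` and `neighborhoodSums` are made up for illustration, and error checking is omitted for brevity.

```cuda
#include <cassert>
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Each thread sums the 3x3 neighborhood around its own element. Neighboring
// threads touch overlapping texels: uncoalesced, but spatially local.
__global__ void sum3x3(cudaTextureObject_t tex, float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    float sum = 0.0f;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            // +0.5f addresses the texel center; clamp mode handles the borders.
            sum += tex2D<float>(tex, x + dx + 0.5f, y + dy + 0.5f);
    out[y * width + x] = sum;
}

// Hypothetical helper: uploads a row-major array and returns the 3x3 sums.
std::vector<float> neighborhoodSums(const std::vector<float>& image, int width, int height)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t array = nullptr;
    cudaMallocArray(&array, &desc, width, height);
    cudaMemcpy2DToArray(array, 0, 0, image.data(), width * sizeof(float),
                        width * sizeof(float), height, cudaMemcpyHostToDevice);

    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;       // no interpolation
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;                   // coordinates in texel units

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    float* dOut = nullptr;
    cudaMalloc(&dOut, image.size() * sizeof(float));
    dim3 block(16, 16);
    dim3 grid((width + 15) / 16, (height + 15) / 16);
    sum3x3<<<grid, block>>>(tex, dOut, width, height);

    std::vector<float> out(image.size());
    cudaMemcpy(out.data(), dOut, out.size() * sizeof(float), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(array);
    cudaFree(dOut);
    return out;
}

int main()
{
    const int w = 64, h = 64;
    std::vector<float> image(w * h, 1.0f);
    std::vector<float> sums = neighborhoodSums(image, w, h);
    assert(sums.size() == image.size());
    printf("sum at (32, 32) = %.1f\n", sums[32 * w + 32]);
    return 0;
}
```

Whether this beats a plain global memory version (or a shared memory tile) depends on the GPU, so it is worth timing both on your own hardware.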

That said, using a texture is a standard workaround for reading float4 arrays. For some reason, coalesced reads from float4 arrays achieve a little more than half the memory bandwidth of coalesced reads from float and float2 arrays. But if you read your 1D float4 array through a texture reference, then you can get full performance again.
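A rough sketch of the float4 workaround follows: the array stays in ordinary cudaMalloc memory and the kernel fetches it with `tex1Dfetch`. This assumes the newer texture object API (CUDA 5.0 or later) instead of a texture reference, and `sumComponents` and `sumFloat4ViaTexture` are hypothetical names. The bandwidth gain described above was observed on the hardware of the time, so treat any speedup on current GPUs as something to measure rather than assume.

```cuda
#include <cassert>
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Each thread fetches one float4 through the texture path and reduces it.
__global__ void sumComponents(cudaTextureObject_t tex, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float4 v = tex1Dfetch<float4>(tex, i);  // integer index, no filtering
        out[i] = v.x + v.y + v.z + v.w;
    }
}

// Hypothetical helper (error checks omitted): returns x+y+z+w per element.
std::vector<float> sumFloat4ViaTexture(const std::vector<float4>& host)
{
    int n = static_cast<int>(host.size());
    float4* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(float4));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemcpy(dIn, host.data(), n * sizeof(float4), cudaMemcpyHostToDevice);

    // Describe the linear (cudaMalloc) buffer as a 1D float4 texture.
    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = dIn;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float4>();
    resDesc.res.linear.sizeInBytes = n * sizeof(float4);

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    sumComponents<<<(n + 255) / 256, 256>>>(tex, dOut, n);

    std::vector<float> out(n);
    cudaMemcpy(out.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFree(dIn);
    cudaFree(dOut);
    return out;
}

int main()
{
    std::vector<float4> host(1024);
    for (int i = 0; i < 1024; ++i)
        host[i] = make_float4(static_cast<float>(i), 1.0f, 2.0f, 3.0f);
    std::vector<float> sums = sumFloat4ViaTexture(host);
    assert(sums.size() == host.size());
    printf("sums[10] = %.1f\n", sums[10]);
    return 0;
}
```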

Texture references can be bound to linear memory (that is, global memory that you allocated with cudaMalloc) or to cudaArrays. You can write to linear memory from a kernel, although texture cache coherency is not preserved within a single kernel call (i.e., if the address you wrote to was already in the cache and you read the texture again, you’ll get the old value). Since you are just writing to global memory, it’s the same speed as any other global memory write.
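One way to stay clear of the coherency problem is to split the work across two kernel launches: write through the plain pointer in the first, and read through the texture only in the second. The sketch below illustrates that pattern under the assumption that you use the later texture object API (CUDA 5.0 or newer); `fill`, `readBack` and `writeThenRead` are illustrative names, and error checking is left out.

```cuda
#include <cassert>
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Kernel 1: ordinary global memory writes through the raw pointer.
__global__ void fill(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 2.0f * i;
}

// Kernel 2: reads the same buffer through the texture path. Because the
// writes happened in an earlier kernel launch, the reads see the new values.
__global__ void readBack(cudaTextureObject_t tex, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1Dfetch<float>(tex, i);
}

// Hypothetical helper: runs both kernels and returns what kernel 2 read.
std::vector<float> writeThenRead(int n)
{
    float* dData = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dData, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));

    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = dData;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = n * sizeof(float);

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    int blocks = (n + 255) / 256;
    fill<<<blocks, 256>>>(dData, n);          // write in one launch
    readBack<<<blocks, 256>>>(tex, dOut, n);  // read in the next launch

    std::vector<float> out(n);
    cudaMemcpy(out.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFree(dData);
    cudaFree(dOut);
    return out;
}

int main()
{
    std::vector<float> result = writeThenRead(1024);
    assert(result.size() == 1024);
    printf("result[100] = %.1f\n", result[100]);
    return 0;
}
```

Both launches go to the same (default) stream, so the second kernel does not start until the first has finished.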

You cannot write to cudaArrays (which are required for 2D or 3D textures) from a kernel, because they pack data into memory in a special format that NVIDIA doesn’t document. The only way to fill them is through the memory copy functions.

simpleTexture shows the basic texture usage, and Appendix D.3 shows how to index the array for table lookup. Unfortunately, you’ll still have to fiddle a bit to put the two together. I haven’t found any other simple examples of using textures like an array.
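In the absence of a ready-made example, here is a small sketch of using a texture like a read-only array: a lookup table lives in cudaMalloc memory and each thread fetches `table[indices[i]]` with `tex1Dfetch`. It is not taken from the SDK. It assumes the texture object API from CUDA 5.0 and later (the texture reference API used by simpleTexture at the time was later deprecated), and `gather` and `lookup` are invented names. Error checks are omitted to keep it short.

```cuda
#include <cassert>
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Data-dependent array access: out[i] = table[indices[i]], read via texture.
__global__ void gather(cudaTextureObject_t table, const int* indices, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1Dfetch<float>(table, indices[i]);
}

// Hypothetical helper: performs the lookup on the GPU and returns the result.
std::vector<float> lookup(const std::vector<float>& table, const std::vector<int>& indices)
{
    int n = static_cast<int>(indices.size());
    float* dTable = nullptr;
    int* dIdx = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dTable, table.size() * sizeof(float));
    cudaMalloc(&dIdx, n * sizeof(int));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemcpy(dTable, table.data(), table.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dIdx, indices.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    // Wrap the table in a 1D texture over linear memory.
    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = dTable;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = table.size() * sizeof(float);

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    gather<<<(n + 255) / 256, 256>>>(tex, dIdx, dOut, n);

    std::vector<float> out(n);
    cudaMemcpy(out.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFree(dTable);
    cudaFree(dIdx);
    cudaFree(dOut);
    return out;
}

int main()
{
    std::vector<float> table = {10.0f, 20.0f, 30.0f, 40.0f};
    std::vector<int> indices = {3, 0, 2, 2, 1};
    std::vector<float> out = lookup(table, indices);
    assert(out.size() == indices.size());
    for (float v : out) printf("%.1f ", v);
    printf("\n");
    return 0;
}
```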

Just a side note: as far as I remember, NVIDIA has only said that you cannot write to textures yet, so this might become possible in the future.

I hope it becomes possible to write to cudaArrays.

Hi guys, thanks for the advice. From the comments it seems that reading info from a texture is generally no different than using global memory. So what’s the point of it then? Thanks, I appreciate all of your help.

The spatially local, but uncoalesced memory read case I mentioned is the main use case. That comes up more often than you might expect.

The other reason you might use a texture is to get linear interpolation in 1D, 2D and (now with CUDA 2.0) 3D done for you in hardware.
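As an illustration of the hardware interpolation, the sketch below stores a small table in a 1D cudaArray and samples it at fractional positions with linear filtering. It assumes the texture object API from CUDA 5.0 onward, and the names `sampleTable` and `interpolate` are made up for this example. With unnormalized coordinates, texel i is centered at i + 0.5, which is why the kernel adds 0.5 to the requested position. The interpolation weights are held in low-precision fixed point, so expect approximate rather than bit-exact results.

```cuda
#include <cassert>
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// xs[i] is a position in table-index units: 0.0 maps to table[0],
// 1.0 to table[1], and 0.5 to a blend of the two.
__global__ void sampleTable(cudaTextureObject_t tex, const float* xs, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1D<float>(tex, xs[i] + 0.5f);  // +0.5f: texel center
}

// Hypothetical helper (error checks omitted): interpolates table at each xs.
std::vector<float> interpolate(const std::vector<float>& table, const std::vector<float>& xs)
{
    int n = static_cast<int>(xs.size());
    size_t bytes = table.size() * sizeof(float);

    // Linear filtering needs a cudaArray; it is not available on linear memory.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t array = nullptr;
    cudaMallocArray(&array, &desc, table.size());
    cudaMemcpy2DToArray(array, 0, 0, table.data(), bytes, bytes, 1, cudaMemcpyHostToDevice);

    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;

    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModeLinear;     // hardware interpolation
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    float* dXs = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dXs, n * sizeof(float));
    cudaMalloc(&dOut, n * sizeof(float));
    cudaMemcpy(dXs, xs.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    sampleTable<<<(n + 255) / 256, 256>>>(tex, dXs, dOut, n);

    std::vector<float> out(n);
    cudaMemcpy(out.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(array);
    cudaFree(dXs);
    cudaFree(dOut);
    return out;
}

int main()
{
    std::vector<float> table = {0.0f, 10.0f, 20.0f, 30.0f};
    std::vector<float> xs = {0.0f, 0.5f, 1.0f, 2.5f};
    std::vector<float> ys = interpolate(table, xs);
    assert(ys.size() == xs.size());
    for (size_t i = 0; i < ys.size(); ++i)
        printf("x = %.2f -> %.3f\n", xs[i], ys[i]);
    return 0;
}
```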