For what case should I use texture memory?

sarutake.nv · May 14, 2010, 4:59am

Hello,

In what cases should I use texture memory (instead of direct global memory access)?

I’ve measured the performance of global memory access via texture memory in several cases.
However, it was slower than direct global memory access in every case.

I wonder which cases are actually suitable for texture memory.

Also, does anyone know the detailed behavior of texture caching?

I understand that texture data that is spatially close in 2D gets cached when the texture memory space is accessed.
But how close?

Who shares the same texture cache?
Threads in the same warp, same block, or all threads?

Thanks in advance, and I’m sorry for my poor English.

E.D_Riedijk · May 14, 2010, 6:31am

You should use texture memory when:

you cannot coalesce your reads, but there is locality in your data access
you are accessing a 2D array, and you want to skip calculating 1D indices, or you have 2D locality
you want to interpolate your data (linear interpolation is for free)

The cache is probably shared per Texture Processing Cluster, that is, 2 multiprocessors on compute capability 1.0 and 1.1 hardware and 3 multiprocessors on 1.2 and 1.3 hardware.

_Big_Mac · May 14, 2010, 2:10pm

I never bought the indexing argument. Turning two 2D indices (which you need anyway in order to address a 2D texture) into one 1D index is a single MAD (i = x + y*xdim).

If you’re doing so many of those MADs that it might become measurable, you’re probably doing just as many memory reads afterwards, at which point your kernel becomes bandwidth bound. The exception is when you’re hitting the L1 tex cache, but at that point you’re benefiting from caching, not from indexing.

A single MAD before a memory read is nothing.

If you need the wrapping behavior, there’s just an additional modulo. For clamping behavior, use the min and max functions.
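As a rough illustration of the arithmetic described above, here is a minimal sketch in plain C++ (so it can be checked on the host; the same expressions work inside a kernel). The helper names `flat_index`, `clamp_coord`, `wrap_coord` and `read_clamped` are made up for this example. One caveat: for negative coordinates a single `%` is not enough in C++, because it truncates toward zero, so the wrap helper applies the modulo twice.

```cpp
#include <algorithm>

// Row-major flat index: a single multiply-add (i = x + y*xdim).
inline int flat_index(int x, int y, int xdim) {
    return x + y * xdim;
}

// Clamp addressing: out-of-range coordinates snap to the nearest edge.
inline int clamp_coord(int c, int dim) {
    return std::min(std::max(c, 0), dim - 1);
}

// Wrap addressing: coordinates repeat with period dim.
// The second modulo keeps the result non-negative when c is negative.
inline int wrap_coord(int c, int dim) {
    return ((c % dim) + dim) % dim;
}

// Hypothetical helper: read a 2D array stored in linear memory,
// emulating clamp addressing in both dimensions.
inline float read_clamped(const float* data, int x, int y, int xdim, int ydim) {
    return data[flat_index(clamp_coord(x, xdim), clamp_coord(y, ydim), xdim)];
}
```

So the index math really is cheap; whatever a texture buys you here has to come from caching or filtering, not from skipping these few instructions.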

Now, caching and imperfect coalescing - that’s more advantageous. Free filtering might make sense if you’re doing the bilinear kind. 1D linear filtering is probably more trouble than it’s worth (only 256 steps between values and you only save like 5 arithmetic instructions).
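To make the "only 256 steps" remark concrete, here is a small host-side sketch of what 1D linear filtering with unnormalized coordinates and clamp addressing computes. It follows the formula in the CUDA Programming Guide (texel centers at i + 0.5, interpolation weight kept with 8 fractional bits). The function name `linear_fetch_1d` and the `quantize` flag are hypothetical, and truncating the weight (rather than rounding it) is an assumption on my part.

```cpp
#include <algorithm>
#include <cmath>

// Software emulation of a 1D linearly filtered fetch from n texels.
// quantize == true keeps only 8 fractional bits of the weight,
// which gives 256 distinct steps between two neighboring texels.
inline float linear_fetch_1d(const float* t, int n, float x, bool quantize) {
    float xb = x - 0.5f;            // texel i is centered at i + 0.5
    float fl = std::floor(xb);
    float alpha = xb - fl;          // fractional position between texels
    if (quantize) {
        // Assumption: truncate to a multiple of 1/256.
        alpha = std::floor(alpha * 256.0f) / 256.0f;
    }
    // Clamp both neighbor indices to the valid range.
    int i0 = std::min(std::max(static_cast<int>(fl), 0), n - 1);
    int i1 = std::min(std::max(static_cast<int>(fl) + 1, 0), n - 1);
    return (1.0f - alpha) * t[i0] + alpha * t[i1];
}
```

With two texels `{0, 256}`, a fetch at x = 1.0 lands halfway and returns 128 either way. A fetch at x = 0.501 returns 0 with quantization but roughly 0.256 at full float precision, which is the kind of precision loss to weigh against the handful of instructions saved.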

sarutake.nv · May 24, 2010, 10:53am

I’m sorry for my late reply.

Thanks for your answer.

So, the texture cache is advantageous under either (or both) of the following two conditions:

Coalescing is imperfect
Bilinear interpolation is required

Is that right?

Still, I don’t know the details of the texture caching algorithm…

MisterAnderson42 · May 24, 2010, 11:24am

NVIDIA doesn’t publish the details, so all that anyone can do is guess.

The best bandwidth is delivered by the tex cache when the threads in each individual warp access values near each other in memory. The texture cache is too small to exploit any temporal locality, and thread scheduling prevents spatial locality between threads in a block from contributing much.
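One way to reason about "near each other in memory" is to count how many distinct cache lines a single warp touches for a given access pattern. The sketch below does that on the host. Since the real texture cache line size and layout are not published, `line_bytes` is a purely hypothetical parameter, and `lines_touched_by_warp` is a made-up helper, not anything from the CUDA toolkit.

```cpp
#include <cstddef>
#include <set>

// Hypothetical locality metric: the number of distinct lines of
// line_bytes bytes that one 32-thread warp touches, given a function
// that maps a lane id (0..31) to the element index that lane reads.
template <typename IndexFn>
int lines_touched_by_warp(IndexFn element_index,
                          std::size_t elem_bytes,
                          std::size_t line_bytes) {
    std::set<std::size_t> lines;
    for (int lane = 0; lane < 32; ++lane) {
        std::size_t byte_offset = element_index(lane) * elem_bytes;
        lines.insert(byte_offset / line_bytes);  // which line this read falls in
    }
    return static_cast<int>(lines.size());
}
```

For 4-byte elements and an assumed 32-byte line, 32 lanes reading consecutive elements touch 4 lines, while lanes reading elements 64 apart touch 32 different lines. The fewer lines per warp, the better the cache can serve the request.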

If you want to read more of my musings on the texture cache, search the forums using Google:

http://www.google.com/search?q=site%3Aforu…lient=firefox-a

sarutake.nv · May 26, 2010, 8:24am

Thanks MisterAnderson42.

Very cool.
Your past discussions cover exactly what I wanted to know!
