The main difference between global memory, texture memory, and constant memory is the path used to access them. All three of these types of memory live in the same off-chip, but on-board, GDDR DRAM chips on your graphics card. There is no physical partitioning of your 512 MB of memory between global, texture, and constant memory. (Constant memory is limited to 64 KB, but as far as I know that is an architectural addressing limit tied to the way constant memory is cached, not a separate physical bank.)
What is different between the three is the hardware used to access it:
Global memory reads go directly to the memory controller, with no intervening cache on these early CUDA GPUs. (Later architectures, from Fermi onward, do route global loads through L1/L2 caches.) This is what happens when you do normal C-style reads from arrays allocated with cudaMalloc().
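To make that concrete, here is a minimal sketch (names like `scale` are mine, not from any particular codebase): memory from cudaMalloc() is ordinary linear device memory, and the kernel indexes it exactly like a C array.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void scale(float *out, const float *in, float k, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = k * in[i];   // a plain global-memory load and store
}

int main(void)
{
    const int n = 1024;
    float h[1024], *d_in, *d_out;
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    cudaMalloc(&d_in,  n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h, n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<(n + 255) / 256, 256>>>(d_out, d_in, 2.0f, n);

    cudaMemcpy(h, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("h[10] = %f\n", h[10]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```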
Constant memory reads go through the 8 kB per multiprocessor constant cache. If the word is found in the cache, it is immediately returned from there. Otherwise, the constant cache fetches the required data from global memory.
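A sketch of how you put data on that path (the kernel and symbol names are illustrative): data declared `__constant__` lives in the 64 KB constant segment, is filled from the host with cudaMemcpyToSymbol(), and reads of it inside a kernel go through the constant cache.

```cuda
#include <cuda_runtime.h>

__constant__ float c_coeffs[4];   // lives in the 64 KB constant segment

__global__ void poly(float *out, const float *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = x[i];
        // Every thread reads the same words -- the best case for the
        // broadcast-oriented constant cache.
        out[i] = c_coeffs[0]
               + v * (c_coeffs[1] + v * (c_coeffs[2] + v * c_coeffs[3]));
    }
}

int main(void)
{
    float coeffs[4] = {1.0f, 0.5f, 0.25f, 0.125f};
    cudaMemcpyToSymbol(c_coeffs, coeffs, sizeof(coeffs));
    // ... allocate x/out with cudaMalloc() and launch poly<<<...>>>() as usual
    return 0;
}
```

The constant cache performs best when all threads in a warp read the same address, since the value can be broadcast to the whole warp in one cycle.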
Texture memory reads go through the 6-8 kB per multiprocessor texture cache (the size depends on the specific GPU). If the word is found in the cache, it is returned immediately; otherwise the texture cache fetches the required words from global memory.
The texture cache also has some other features. It does low-precision interpolation between array elements “for free”, and it also understands how to read a specially packed multi-dimensional array format. This is why 2D textures must be loaded onto the card using “CUDA Arrays”, which are not organized in memory using the normal linear layout. Instead the elements are arranged into something like a Z-order curve, which keeps spatially neighboring elements close together in memory.
With the exception of 1D textures (which can be bound to ordinary linear memory), the texture cache expects the data to be in this special CUDA Array order. The array copy you mention is just rearranging the elements into that order. It isn’t actually moving the data to a special segment of “texture memory.”
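Here is a sketch of the whole texture path using the texture-object API (available since CUDA 5; the older `texture<>` reference API from early CUDA versions works similarly, and all names here are illustrative). cudaMallocArray() allocates the specially packed CUDA Array layout, and cudaMemcpy2DToArray() is the copy that rearranges the linear host data into it.

```cuda
#include <cuda_runtime.h>

__global__ void sample(float *out, cudaTextureObject_t tex, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h)
        // Sampling at texel centers; with linear filtering enabled the
        // hardware interpolates between neighboring texels "for free".
        out[y * w + x] = tex2D<float>(tex, x + 0.5f, y + 0.5f);
}

int main(void)
{
    const int w = 64, h = 64;
    static float host[64][64];
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            host[y][x] = (float)(x + y);

    // Allocate the packed CUDA Array and rearrange the linear data into it.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    cudaMallocArray(&arr, &desc, w, h);
    cudaMemcpy2DToArray(arr, 0, 0, host, w * sizeof(float),
                        w * sizeof(float), h, cudaMemcpyHostToDevice);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc texd = {};
    texd.filterMode = cudaFilterModeLinear;   // hardware interpolation
    texd.readMode   = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &texd, NULL);

    float *d_out;
    cudaMalloc(&d_out, w * h * sizeof(float));
    dim3 block(16, 16), grid((w + 15) / 16, (h + 15) / 16);
    sample<<<grid, block>>>(d_out, tex, w, h);

    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(d_out);
    return 0;
}
```

Note that the data still lives in the same on-board DRAM throughout; only the element ordering and the access path (through the texture cache) change.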