Local Memory - What is that? Memory Hierarchies

wumpus · December 6, 2007, 2:42pm

Yes, I know about the PTX_ISA PDF… It is not meant as a hardware description. In one of the first pages, it already talks about a ‘virtual machine’. It is meant as a generic description of current and future NVIDIA computing devices. Did you notice that it contains other things that aren’t actually implemented? One example is the .surface memory space. AFAIK, it does not exist on G80.

None of the real hardware descriptions (like the one in the CUDA developer guide) mention a local memory cache, so you cannot assume local memory is actually cached. Some experiments and timings have also shown that local memory is slow. Also, explicitly making things local was deprecated in 1.0. Try to steer clear of it as much as possible.
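For readers wondering what kind of code ends up in local memory at all, here is a minimal sketch. It assumes a per-thread array read through an index that is only known at run time, which the compiler usually cannot keep in registers. The names `scratch_kernel`, `run_scratch` and `SCRATCH_SIZE` are made up for illustration, and whether the array really lands in local memory depends on the compiler version and target.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define SCRATCH_SIZE 64  // hypothetical per-thread array length (power of two)

// Hypothetical kernel: every thread fills a private array and reads one
// element back through an index that is only known at run time.
__global__ void scratch_kernel(const int *idx, float *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    float scratch[SCRATCH_SIZE];  // per-thread; a candidate for local memory
    for (int i = 0; i < SCRATCH_SIZE; ++i)
        scratch[i] = (float)(tid + i);

    // Registers cannot be indexed dynamically, so the compiler will usually
    // (not always) place 'scratch' in local memory because of this read.
    out[tid] = scratch[idx[tid] & (SCRATCH_SIZE - 1)];
}

// Hypothetical host helper: runs the kernel once and returns the results.
// Returns an empty vector if a CUDA call fails.
std::vector<float> run_scratch(const std::vector<int> &idx)
{
    int n = (int)idx.size();
    std::vector<float> out(n);
    if (n == 0) return out;

    int *d_idx = nullptr;
    float *d_out = nullptr;
    if (cudaMalloc(&d_idx, n * sizeof(int)) != cudaSuccess) return {};
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_idx);
        return {};
    }
    cudaMemcpy(d_idx, idx.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    scratch_kernel<<<blocks, threads>>>(d_idx, d_out, n);

    // This copy waits for the kernel and also reports any launch error.
    cudaError_t err = cudaMemcpy(out.data(), d_out, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_idx);
    cudaFree(d_out);
    if (err != cudaSuccess) return {};
    return out;
}
```

Compiling with `nvcc -Xptxas -v` prints per-kernel resource usage, including local memory (reported as lmem or stack frame bytes, depending on the toolkit version), so you can check what the compiler actually did instead of guessing.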

Sarnath · December 6, 2007, 2:54pm

Yes, but along the same lines, one cannot say that it is NOT cached, or that it resides in global memory.

Sure. It could be that, since this memory is per-thread in nature, it causes the warp to execute in lock-step whenever it is accessed. Thus a memory access that usually completes or stalls in one clock cycle (depending on the kind of memory) for the entire warp now occurs in lock-step fashion (completes or stalls), which can drastically slow down performance.

If someone from NVIDIA could comment on this, it would be great!

wumpus · December 6, 2007, 2:57pm

It does reside in global memory, I’m sure of that much. Then again, constant data and shader code also reside in global memory, so that fact doesn’t tell you anything about the caching scheme, that’s true.

Sarnath · December 6, 2007, 2:58pm

What do you base your claim on?

wumpus · December 6, 2007, 3:04pm

By dumping the GPU memory. You can find the code and constants for all the kernels by reading the right (global memory) offsets from within a kernel.

Sarnath · December 6, 2007, 3:08pm

Aah. That’s pretty interesting. So, I assume you did that and found it out! Hmm… That sounds cool!

DenisR · December 6, 2007, 6:13pm

Check the link in his signature!
