Question about local memory

I’m wondering whether local memory is cached…
I found “The local and global memory spaces are read-write regions of device memory and are not cached.” on page 73, NVIDIA_ProgrammingGuide_2.3.

I also found "The local state space (.local) is private memory for each thread to keep its own data. It is typically standard memory with cache." on page 23, ptx_isa_1.4.

Is my understanding correct? So is local memory cached after all?


The PTX statement contradicts everything that I have read. I was under the impression that local memory is simply mapped into global memory with a per-thread offset, so I would expect it to be cached only if global memory is cached (it isn’t on Tesla but is supposed to be on Fermi).

One more edit: The semantics of local memory make it easier to cache than global memory. It is private to each thread so there is no need at all for cache coherence, flushes, etc.

This also contradicts memory benchmarks I recall seeing long ago which showed that local memory reads were only as fast as coalesced global memory reads.
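
To make the "mapped into global memory with a per-thread offset" idea concrete, here is a rough host-side sketch of two ways such a mapping could work. Everything in it is an assumption for illustration: the function names, the parameters, and both layouts are hypothetical and are not taken from the Programming Guide or the PTX document quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model only: per-thread local memory carved out of one
// global allocation that starts at address `base`.

// Contiguous layout: thread t owns the byte range
// [base + t * bytes_per_thread, base + (t + 1) * bytes_per_thread).
std::uint64_t local_addr_contiguous(std::uint64_t base, std::size_t thread_id,
                                    std::size_t bytes_per_thread,
                                    std::size_t byte_offset) {
    assert(byte_offset < bytes_per_thread);
    return base + thread_id * bytes_per_thread + byte_offset;
}

// Interleaved layout: 32-bit word w of thread t sits directly before
// word w of thread t + 1, so neighbouring threads touch neighbouring words.
std::uint64_t local_addr_interleaved(std::uint64_t base, std::size_t thread_id,
                                     std::size_t num_threads,
                                     std::size_t word_index) {
    assert(thread_id < num_threads);
    return base + (word_index * num_threads + thread_id) * sizeof(std::uint32_t);
}
```

In the interleaved variant, threads that read the same word index hit consecutive 4-byte addresses, which is the access pattern that coalesces on global memory. If the hardware does something like that, it would fit with local memory reads being only as fast as coalesced global memory reads, with no cache involved.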

Maybe the NVIDIA guys… slipped Fermi capability information in there :teehee:

Thanks for your benchmarks~ :rolleyes: