Question About Memory Hierarchy

Hi everyone. In the NVIDIA CUDA Programming Guide I read “Each thread has a private local memory”. Is this in host memory or in GPU memory?

The guide also says “Finally, all threads have access to the same global memory.” Does this mean that a thread can access only its private memory and its block memory, or that it can access its private memory, its block memory, and global memory as well?

Local memory is stored on the device. On Fermi it’s cacheable, so it may sit in on-chip L1 or L2, but this is transparent to you.
“Local” just means it’s only accessible by the thread that owns it.
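
To make that concrete, here is a minimal CUDA sketch of a per-thread array. The names localScratch and runLocalScratch are made up for illustration, error checking is left out, and whether the compiler really places scratch in local memory instead of registers is an assumption (a dynamically indexed array usually ends up there, but it's the compiler's decision).

```cuda
#include <cuda_runtime.h>

// Each thread fills and reads its own copy of scratch; no other thread can
// see it. The runtime index below usually forces the array into local memory
// (assumption about compiler behavior, not a guarantee).
__global__ void localScratch(int *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    int scratch[32];                  // private to this thread
    for (int i = 0; i < 32; ++i)
        scratch[i] = tid + i;

    out[tid] = scratch[tid % 32];     // equals tid + tid % 32
}

// Hypothetical host helper: launches the kernel and returns out[idx].
int runLocalScratch(int n, int idx)
{
    int *d_out = nullptr;
    cudaMalloc(&d_out, n * sizeof(int));          // the result lives in global memory

    localScratch<<<(n + 127) / 128, 128>>>(d_out, n);

    int result = -1;
    cudaMemcpy(&result, d_out + idx, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return result;
}
```

Every thread writes to "the same" scratch[i] in the source, yet each one gets back a value based on its own tid, because each thread has its own copy. Compiling with nvcc -Xptxas -v reports local memory use per kernel if you want to check where the array actually went.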

Global memory is stored in the same place and handled the same way (on the device, and cacheable on Fermi), except that it’s addressable by any thread.
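
A rough CUDA sketch of the difference in scope, in case it helps. The kernel countThreads and the helper runCountThreads are hypothetical names, error checking is omitted, and the code assumes a device that supports atomicAdd on shared memory. A thread can touch its own registers and local memory, its block's shared memory, and global memory.

```cuda
#include <cuda_runtime.h>

__global__ void countThreads(int *counter)
{
    __shared__ int blockCount;        // shared memory: one copy per block
    if (threadIdx.x == 0) blockCount = 0;
    __syncthreads();

    atomicAdd(&blockCount, 1);        // only threads of this block see blockCount
    __syncthreads();

    if (threadIdx.x == 0)
        atomicAdd(counter, blockCount);   // global memory: same address for every block
}

// Hypothetical host helper: returns the total number of threads counted.
int runCountThreads(int blocks, int threadsPerBlock)
{
    int *d_counter = nullptr;
    cudaMalloc(&d_counter, sizeof(int));
    cudaMemset(d_counter, 0, sizeof(int));

    countThreads<<<blocks, threadsPerBlock>>>(d_counter);

    int total = -1;
    cudaMemcpy(&total, d_counter, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return total;
}
```

With 4 blocks of 64 threads, each block counts 64 in its own blockCount, and the four blocks add up to 256 in the single global counter.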

Be careful about using your own adjectives to describe a memory type. Unlike on a CPU, there are many memory types on a GPU, and using their exact names makes it clear what you’re referring to. There is no such thing as “private memory” or “block memory”; the names you want are local memory and shared memory. The 8 types of memory in CUDA: constant, texture, host, zero-copy (mapped), global, local, shared, and registers.
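
For reference, a small CUDA sketch of where several of these types show up in source code. All names here (scale, TILE, scaleKernel, runScale) are invented for the example, error checking is omitted, and texture and zero-copy memory are left out to keep it short.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int TILE = 128;             // threads per block, also the tile size

__constant__ float scale;             // constant memory, read-only in kernels

// in and out point to global memory allocated with cudaMalloc.
__global__ void scaleKernel(const float *in, float *out, int n)
{
    __shared__ float tile[TILE];      // shared memory, one tile per block
    int tid = blockIdx.x * blockDim.x + threadIdx.x;   // scalar, normally a register

    if (tid < n) tile[threadIdx.x] = in[tid];
    __syncthreads();
    if (tid < n) out[tid] = tile[threadIdx.x] * scale;
}

// Hypothetical host helper: fills in[i] = i, scales by factor, returns out[idx].
float runScale(int n, float factor, int idx)
{
    std::vector<float> h_in(n);       // ordinary host memory
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(scale, &factor, sizeof(float));

    scaleKernel<<<(n + TILE - 1) / TILE, TILE>>>(d_in, d_out, n);

    float result = 0.0f;
    cudaMemcpy(&result, d_out + idx, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

The staging through tile is not needed for a plain scale; it's only there so that shared memory appears next to the other types.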

One last term that is commonly but imprecisely used is “device memory”; it usually refers to global memory, as opposed to host memory.

I wonder if eventually we’ll see yet another memory type (remote memory??) for addressing memory accessed via GPUDirect, which lets different devices share memory or communicate with each other.
