Am I right that, despite the illustrations in the guide, all of the CUDA multiprocessors have access to the full range of constant memory, even though they don't have access to one another's shared memory?
Also, it is not entirely clear how pointers work with so many different memory spaces involved (constant, texture, each multiprocessor's shared memory, and global memory). In which of these spaces can we construct a pointer that points to a location in which of the others? For instance, I know we can't have a pointer in one multiprocessor's shared memory that points to a location in another multiprocessor's shared memory, and I think(?) that a pointer in either constant or global memory can point to a location in either constant or global memory, but I'm not sure.
1- Yes: every Scalar Processor, and therefore every thread, has access to the whole of constant memory, as well as to texture memory.
2- Constant, texture, global and "local" memory all live in the same physical video-card memory; they differ only in access primitives, rights and/or caching. That is: constant and texture reads are cached; constant memory cannot be overwritten from a kernel anyway; textures may be overwritten, but the texture cache won't snoop those modifications in real time; and global memory is not cached at all.
So there are really only three distinct spaces:
- Registers (which may include non-dynamically-indexed arrays), private to each thread: 8192 (or 16384 on G200 processors, GTX 260 and up) 32-bit registers, divided among the threads resident on a multiprocessor.
- Shared memory (16 KB per multiprocessor), which you may share among the threads running on the same multiprocessor.
- Everything else, which physically resides in the video card's main memory.
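To make the three spaces concrete, here is a minimal CUDA sketch (the kernel and variable names are illustrative, not from the guide) showing where each kind of variable lives:

```cuda
// Lives in video-card memory, cached, read-only from device code.
__constant__ float coeffs[16];

__global__ void demo(const float *global_in, float *global_out)  // parameters point into global memory
{
    __shared__ float tile[256];        // shared memory: one copy per block, on that block's multiprocessor
    float acc;                         // register: private to this thread

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = global_in[i];  // global -> shared
    __syncthreads();                   // wait until all threads in the block have written

    acc = tile[threadIdx.x] * coeffs[threadIdx.x % 16];  // shared + constant -> register
    global_out[i] = acc;               // register -> global
}
```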
And so there are two kinds of pointers: shared-memory pointers and global-memory pointers. A shared-memory pointer can ONLY point into the shared memory of the multiprocessor that is executing the thread, and there is no way to exchange shared-memory data directly between two multiprocessors without going through main-memory (global memory) I/O.
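As a sketch of that last point (kernel and buffer names are my own, not from the post): if one block's shared-memory results are needed by another block, possibly running on a different multiprocessor, they have to be staged through global memory, typically across two kernel launches:

```cuda
__global__ void produce(float *staging)
{
    __shared__ float partial[256];
    partial[threadIdx.x] = (float)threadIdx.x;  // some per-block shared-memory result
    float *p = &partial[threadIdx.x];           // shared-memory pointer: meaningful only within this block
    __syncthreads();

    // Shared memory vanishes when the block finishes, so flush it to global memory.
    staging[blockIdx.x * blockDim.x + threadIdx.x] = *p;
}

__global__ void consume(const float *staging, float *out)
{
    // A different block (possibly on another multiprocessor) reads the staged data back.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = staging[i] * 2.0f;
}
```

The kernel boundary between `produce` and `consume` is what guarantees the global-memory writes are visible to the second kernel's blocks.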