Shared memory : shared access

BarsMonster · July 21, 2008, 6:42am

Is it possible to access the same portion of shared memory from all threads on a particular multiprocessor?

Actually I just need more constant space, so I am wondering if it is possible to use shared memory in a similar way.

JoeCollege · July 21, 2008, 7:12am

I’m not sure, but I think shared memory has what’s called a broadcast mode. Download the programming guide from the docs page and look in there.

http://www.nvidia.com/object/cuda_develop.html

BenW · July 21, 2008, 7:07pm

I believe that shared memory is local to a block, and a multiprocessor has between 1 and 8 blocks running at any given time, each with its own “piece” of the shared memory (the 16k gets divvied up among the blocks, so if each block wants 16k of shared memory, then the multiprocessor is forced to run them one at a time). However, there is (to my knowledge) no mechanism for a block to read from shared memory not exclusively allocated to it.

So my best answer is this: “sort of”. If the multiprocessor is forced to run one block at a time, either because each block requests the full 16k of shared memory or because the register count is maxed out (not recommended… that means global writes and other high-latency operations can’t be hidden by other blocks, among other bad things), then yes: all threads on the multiprocessor, being in the same block, can access all the shared memory that has been allocated to that block. However, if more than one block is running simultaneously, the blocks cannot read each other’s shared memory. (Does that make sense?)
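To see how a block's shared memory request limits the number of blocks resident on a multiprocessor, one option is to ask the runtime directly. This is a minimal sketch, assuming a CUDA toolkit recent enough to provide `cudaOccupancyMaxActiveBlocksPerMultiprocessor` (it did not exist when this thread was written). The kernel `usesShared` and the helper `residentBlocks` are hypothetical names used only for illustration, and the reported limits depend on the device.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: it exists only so the runtime has something to analyse.
// The dynamic shared memory size is supplied at launch (or, here, at query time).
__global__ void usesShared(float *out)
{
    extern __shared__ float buf[];
    buf[threadIdx.x] = static_cast<float>(threadIdx.x);
    __syncthreads();
    out[blockIdx.x * blockDim.x + threadIdx.x] = buf[blockDim.x - 1 - threadIdx.x];
}

// Returns how many blocks of usesShared can be resident on one multiprocessor
// for the given block size and dynamic shared memory request, or -1 on error.
int residentBlocks(int threadsPerBlock, size_t sharedBytes)
{
    int blocks = 0;
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &blocks, usesShared, threadsPerBlock, sharedBytes);
    return err == cudaSuccess ? blocks : -1;
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("no usable CUDA device\n");
        return 1;
    }
    printf("shared memory per block limit: %zu bytes\n", prop.sharedMemPerBlock);

    // Growing the per-block request should never increase the resident block count.
    size_t requests[] = {1024, prop.sharedMemPerBlock / 4, prop.sharedMemPerBlock};
    for (size_t bytes : requests)
        printf("%zu bytes per block -> %d resident block(s) per multiprocessor\n",
               bytes, residentBlocks(128, bytes));
    return 0;
}
```

On the hardware discussed in this thread the shared memory pool was 16 KB per multiprocessor, so a block asking for all of it would be expected to run alone. Later devices have larger pools, so the numbers printed will differ.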

BTW: “Broadcast” mode is a device-level speedup for reducing bank conflicts when threads from within the same block all try to read from the same location in shared memory at the same time. It does not (to my knowledge) allow shared memory to cross-talk between blocks.
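Since shared memory is private to each block, one way to use it as extra read-only space is to have every block copy the table from global memory into its own shared copy before using it. The following is a minimal sketch under that assumption. `TABLE_SIZE`, `scaleByTable` and `scaleOnDevice` are hypothetical names, and the final read, where every thread fetches the same element, is the broadcast case described above.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int TABLE_SIZE = 256;  // hypothetical table length: 1 KB of floats

// Each block stages the read-only table into its own shared memory copy,
// then every thread multiplies its input by the same table entry.
__global__ void scaleByTable(const float *table, const float *in, float *out, int n, int idx)
{
    __shared__ float sTable[TABLE_SIZE];

    // Cooperative load: the threads of the block stride over the table.
    for (int i = threadIdx.x; i < TABLE_SIZE; i += blockDim.x)
        sTable[i] = table[i];
    __syncthreads();  // the copy must be complete before any thread reads it

    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n)
        out[gid] = in[gid] * sTable[idx];  // all threads read the same word
}

// Hypothetical host helper: returns an empty vector if any CUDA call fails.
std::vector<float> scaleOnDevice(const std::vector<float> &table,
                                 const std::vector<float> &in, int idx)
{
    const int n = static_cast<int>(in.size());
    if (static_cast<int>(table.size()) != TABLE_SIZE || n == 0 || idx < 0 || idx >= TABLE_SIZE)
        return {};

    float *dTable = nullptr, *dIn = nullptr, *dOut = nullptr;
    std::vector<float> out(n);
    bool ok = cudaMalloc(&dTable, TABLE_SIZE * sizeof(float)) == cudaSuccess
           && cudaMalloc(&dIn, n * sizeof(float)) == cudaSuccess
           && cudaMalloc(&dOut, n * sizeof(float)) == cudaSuccess
           && cudaMemcpy(dTable, table.data(), TABLE_SIZE * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dIn, in.data(), n * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threads = 128;
        scaleByTable<<<(n + threads - 1) / threads, threads>>>(dTable, dIn, dOut, n, idx);
        // This copy waits for the kernel and also reports any launch failure.
        ok = cudaMemcpy(out.data(), dOut, n * sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dTable);
    cudaFree(dIn);
    cudaFree(dOut);
    return ok ? out : std::vector<float>{};
}

int main()
{
    std::vector<float> table(TABLE_SIZE), in(1000, 1.0f);
    for (int i = 0; i < TABLE_SIZE; ++i)
        table[i] = 0.5f * i;

    std::vector<float> out = scaleOnDevice(table, in, 4);
    if (out.empty()) {
        printf("CUDA call failed\n");
        return 1;
    }
    printf("out[0] = %f\n", out[0]);
    return 0;
}
```

The cost of this approach is that the table is re-read from global memory once per block, so it pays off mainly when each block reuses the table many times.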

Good luck!

Ben

tmurray · July 21, 2008, 9:14pm

There is no mechanism for shared memory across blocks. One potential solution: run bigger blocks. (Blocks of blocks? That seems like a bad idea for a lot of reasons…)

_Big_Mac · July 21, 2008, 9:25pm

If you need an emulation of constant memory, you might try textures. They get cached AFAIR, so they’re a step better than global memory in that respect. They are also readable across the whole grid.
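A rough sketch of the texture route, for illustration only. It assumes the texture object API (`cudaCreateTextureObject`, `tex1Dfetch`), which was added to CUDA well after this thread was written; at the time the equivalent would have used texture references. The names `gatherThroughTexture` and `readThroughTexture` are hypothetical.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Every thread in the grid reads the table through the texture path.
__global__ void readThroughTexture(cudaTextureObject_t tex, float *out, int n, int tableSize)
{
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n)
        out[gid] = tex1Dfetch<float>(tex, gid % tableSize);
}

// Hypothetical host helper: returns an empty vector if any CUDA call fails.
std::vector<float> gatherThroughTexture(const std::vector<float> &table, int n)
{
    const int tableSize = static_cast<int>(table.size());
    if (tableSize == 0 || n <= 0)
        return {};

    float *dTable = nullptr, *dOut = nullptr;
    cudaTextureObject_t tex = 0;
    std::vector<float> out(n);

    bool ok = cudaMalloc(&dTable, tableSize * sizeof(float)) == cudaSuccess
           && cudaMalloc(&dOut, n * sizeof(float)) == cudaSuccess
           && cudaMemcpy(dTable, table.data(), tableSize * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        // Describe the linear device buffer that backs the texture.
        cudaResourceDesc resDesc = {};
        resDesc.resType = cudaResourceTypeLinear;
        resDesc.res.linear.devPtr = dTable;
        resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
        resDesc.res.linear.sizeInBytes = tableSize * sizeof(float);

        // Return the stored floats unchanged (no normalisation).
        cudaTextureDesc texDesc = {};
        texDesc.readMode = cudaReadModeElementType;

        ok = cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr) == cudaSuccess;
    }
    if (ok) {
        const int threads = 128;
        readThroughTexture<<<(n + threads - 1) / threads, threads>>>(tex, dOut, n, tableSize);
        // This copy waits for the kernel and also reports any launch failure.
        ok = cudaMemcpy(out.data(), dOut, n * sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
        cudaDestroyTextureObject(tex);
    }
    cudaFree(dTable);
    cudaFree(dOut);
    return ok ? out : std::vector<float>{};
}

int main()
{
    std::vector<float> table(256);
    for (int i = 0; i < 256; ++i)
        table[i] = 2.0f * i;

    std::vector<float> out = gatherThroughTexture(table, 1024);
    if (out.empty()) {
        printf("CUDA call failed\n");
        return 1;
    }
    printf("out[3] = %f, out[259] = %f\n", out[3], out[259]);
    return 0;
}
```

Unlike the shared memory approach, the table is written once and is visible to every block, at the price of going through the texture cache rather than on-chip shared memory.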
