How to efficiently use shared memory?

blade613x · September 28, 2015, 11:16pm

Edit: Figured this one out and was fussing over something that is taken care of on the driver level. Removed because I don’t want to potentially confuse others.

little_jimmy · September 29, 2015, 5:15am

i think the first assumption is that shared memory would be the optimal solution
to what extent this is true, i do not know

you also seem to want to share shared memory between kernel blocks

what is the relationship between the complete 1024x1024 matrix, and the 32x32 blocks?
is there some kind of data reuse between/across 32x32 blocks, and is that the reason why you wish to (re)use shared memory across kernel blocks as well?

you could have blocks loop, based on a global memory atomic, and have the blocks increment their indices internally, based on the atomic
this way, a block essentially ‘becomes’ or ‘behaves as’ many blocks, and you can ‘share’ shared memory across blocks (as you never really change the block, only the block addressing)
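The looping-block idea above can be sketched roughly as follows. This is only a minimal illustration, not the original poster's code (which was removed from the thread): the kernel name `persistentTileSum`, the host helper `runPersistentTileSum`, the counter `nextTile`, and the per-tile work (summing each 32x32 tile of a 1024x1024 matrix) are all assumptions made up for the example. Each launched block repeatedly claims the next tile index from a global atomic counter and reuses the same shared memory buffer for every tile it processes.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int N = 1024;                  // matrix is N x N (size from the thread)
constexpr int TILE = 32;                 // tiles are TILE x TILE
constexpr int TILES_PER_ROW = N / TILE;
constexpr int NUM_TILES = TILES_PER_ROW * TILES_PER_ROW;

// Hypothetical kernel: each block loops, claiming tile indices from a global
// atomic counter, so one block "behaves as" many blocks.
__global__ void persistentTileSum(const float* in, float* tileSums,
                                  unsigned int* nextTile)
{
    __shared__ float tile[TILE][TILE];   // reused for every tile this block handles
    __shared__ unsigned int current;     // tile index claimed for this iteration

    while (true) {
        // One thread claims the next tile and publishes it to the block.
        if (threadIdx.x == 0 && threadIdx.y == 0)
            current = atomicAdd(nextTile, 1u);
        __syncthreads();

        unsigned int t = current;
        if (t >= static_cast<unsigned int>(NUM_TILES))
            break;                       // same value for all threads, so all exit together

        // Stage the tile in shared memory.
        int row = (t / TILES_PER_ROW) * TILE + threadIdx.y;
        int col = (t % TILES_PER_ROW) * TILE + threadIdx.x;
        tile[threadIdx.y][threadIdx.x] = in[row * N + col];
        __syncthreads();

        // Placeholder work: a deliberately simple serial sum by one thread.
        if (threadIdx.x == 0 && threadIdx.y == 0) {
            float s = 0.0f;
            for (int y = 0; y < TILE; ++y)
                for (int x = 0; x < TILE; ++x)
                    s += tile[y][x];
            tileSums[t] = s;
        }
        // Make sure the tile and `current` are no longer in use before the
        // next iteration overwrites them.
        __syncthreads();
    }
}

// Hypothetical host helper: returns the number of tiles with a wrong sum,
// or -1 if a CUDA call failed.
int runPersistentTileSum(int numBlocks)
{
    std::vector<float> hIn(N * N);
    for (int r = 0; r < N; ++r)
        for (int c = 0; c < N; ++c) {
            int tileId = (r / TILE) * TILES_PER_ROW + (c / TILE);
            hIn[r * N + c] = static_cast<float>(tileId % 8);  // small exact values
        }

    float* dIn = nullptr;
    float* dSums = nullptr;
    unsigned int* dNext = nullptr;
    if (cudaMalloc(&dIn, hIn.size() * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&dSums, NUM_TILES * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&dNext, sizeof(unsigned int)) != cudaSuccess) return -1;
    cudaMemcpy(dIn, hIn.data(), hIn.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dNext, 0, sizeof(unsigned int));   // counter starts at tile 0

    // Far fewer blocks than tiles: each block loops over many tiles.
    persistentTileSum<<<numBlocks, dim3(TILE, TILE)>>>(dIn, dSums, dNext);
    cudaError_t err = cudaGetLastError();

    std::vector<float> hSums(NUM_TILES);
    if (err == cudaSuccess)
        err = cudaMemcpy(hSums.data(), dSums, NUM_TILES * sizeof(float),
                         cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dSums);
    cudaFree(dNext);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int t = 0; t < NUM_TILES; ++t)
        if (hSums[t] != static_cast<float>(TILE * TILE * (t % 8))) ++mismatches;
    return mismatches;
}
```

Calling `runPersistentTileSum(8)` launches 8 blocks that between them work through all 1024 tiles. Note that this only reuses the shared memory allocation; any data that should carry over from one tile to the next still has to be kept there deliberately, and whether this beats a plain one-block-per-tile launch would need to be measured.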

inJeans · September 29, 2015, 7:30am

You should check out Mark Harris’ blog post: http://devblogs.nvidia.com/parallelforall/using-shared-memory-cuda-cc/

I think that will answer all of your questions
