Copy data into shared memory

_Tom · May 27, 2009, 6:16pm

Hi,
I was thinking that it would be really useful to be able to copy something into shared memory before kernel execution, e.g. from the CPU or from another kernel…
to avoid performing a gst/gld (global store/load) when I know I will need those values right after.

Is there a way to allocate and copy data into a “static” portion of shared memory, so that it doesn’t get deleted when I launch a new kernel?

Thanks!

sergeyn · May 27, 2009, 7:48pm

To me it sounds like what you actually want is constant memory. Constant data is preserved across kernel invocations, and reads are about as fast as shared memory when they hit the constant cache and all threads of a half-warp read the same address.
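
To make the suggestion concrete, here is a minimal sketch of the constant memory pattern. The table name `coeffs`, its size, and the kernel `applyCoeffs` are made up for illustration and are not from this thread; the only runtime calls used are `cudaMalloc`, `cudaMemcpy`, `cudaMemcpyToSymbol` and `cudaFree`.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define NUM_COEFFS 16

// Hypothetical lookup table; name and size are assumptions for this sketch.
__constant__ float coeffs[NUM_COEFFS];

__global__ void applyCoeffs(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float acc = 0.0f;
    // Every thread reads the same coeffs[k] in the same iteration,
    // which is the access pattern the constant cache can broadcast.
    for (int k = 0; k < NUM_COEFFS; ++k)
        acc += coeffs[k] * in[i];
    out[i] = acc;
}

// Returns the number of wrong outputs (0 means success), or -1 on a CUDA error.
int runConstantDemo()
{
    const int n = 256;
    std::vector<float> hostCoeffs(NUM_COEFFS, 1.0f), hostIn(n), hostOut(n);
    for (int i = 0; i < n; ++i) hostIn[i] = (float)i;

    float *devIn = 0, *devOut = 0;
    if (cudaMalloc(&devIn, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&devOut, n * sizeof(float)) != cudaSuccess) { cudaFree(devIn); return -1; }
    cudaMemcpy(devIn, hostIn.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // The host fills the table once; it stays valid for every later launch.
    cudaMemcpyToSymbol(coeffs, hostCoeffs.data(), NUM_COEFFS * sizeof(float));

    applyCoeffs<<<(n + 127) / 128, 128>>>(devIn, devOut, n);
    cudaError_t err = cudaMemcpy(hostOut.data(), devOut, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(devIn);
    cudaFree(devOut);
    if (err != cudaSuccess) return -1;

    // With all coefficients set to 1, each output should be 16 * input.
    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (hostOut[i] != 16.0f * i) ++mismatches;
    return mismatches;
}

int main()
{
    printf("mismatches: %d\n", runConstantDemo());
    return 0;
}
```

If threads indexed the table with different values of `k` at the same time, the reads would be serialized, so this layout only pays off for uniform lookups.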

_Tom · May 27, 2009, 8:12pm

Hmm, yes, but only partially:

"constant variables cannot be assigned to from the device, only from the host through host runtime functions"

So I can’t use it to store the results of one kernel for another one… something that could be really interesting performance-wise.
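
One possible workaround, sketched here as an assumption rather than something proposed in this thread: the restriction is on writes from device code, but the host can ask for a device-to-device copy into the symbol between two launches using `cudaMemcpyToSymbol` with `cudaMemcpyDeviceToDevice`. The names `stage`, `produce`, `consume` and `runHandOff` are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 64

// Hypothetical hand-off buffer in constant memory.
__constant__ float stage[N];

// First kernel: writes its results to ordinary global memory.
__global__ void produce(float *buf)
{
    int i = threadIdx.x;
    if (i < N) buf[i] = 2.0f * i;
}

// Second kernel: reads the first kernel's results through the constant cache.
__global__ void consume(float *out)
{
    int i = threadIdx.x;
    if (i >= N) return;
    float sum = 0.0f;
    for (int k = 0; k < N; ++k)   // uniform index, so reads can be broadcast
        sum += stage[k];
    out[i] = sum + i;
}

// Fills hostOut[0..N-1]; returns 0 on success, -1 on a CUDA error.
int runHandOff(float *hostOut)
{
    float *devBuf = 0, *devOut = 0;
    if (cudaMalloc(&devBuf, N * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&devOut, N * sizeof(float)) != cudaSuccess) { cudaFree(devBuf); return -1; }

    produce<<<1, N>>>(devBuf);
    // Issued by the host, but the data never leaves the device. In the
    // default stream this copy runs after produce and before consume.
    cudaMemcpyToSymbol(stage, devBuf, N * sizeof(float), 0,
                       cudaMemcpyDeviceToDevice);
    consume<<<1, N>>>(devOut);

    cudaError_t err = cudaMemcpy(hostOut, devOut, N * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(devBuf);
    cudaFree(devOut);
    return err == cudaSuccess ? 0 : -1;
}

int main()
{
    float out[N];
    if (runHandOff(out) != 0) { fprintf(stderr, "CUDA error\n"); return 1; }
    printf("out[0] = %g, out[%d] = %g\n", out[0], N - 1, out[N - 1]);
    return 0;
}
```

The data still goes through global memory once, so this only saves anything if the second kernel reads the values many times in a broadcast-friendly way.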

Also, what is the size of constant memory? I can’t find it documented anywhere.

sergeyn · May 27, 2009, 8:23pm

What you could try is to read uninitialized shared memory inside the kernel after you’ve executed another kernel which wrote into that shared memory area.
I doubt that the whole shared memory block is cleared before each kernel invocation, so this might actually work if you keep the shared memory size per thread block the same (and the same set of input arguments). You also have no guarantee that all of shared memory will be touched by your ‘writing’ kernel, so you’d probably want to take that into account.

If you do experiment, please post your findings here.

The size of constant memory is 64 KB.
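
Rather than relying on a remembered number, the limits can also be read from the device at run time. A small sketch, assuming device 0 is the one of interest; the helper name `queryConstMemBytes` is made up, while `cudaGetDeviceProperties` and the `totalConstMem` and `sharedMemPerBlock` fields are part of the CUDA runtime.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns the constant memory size of device 0 in bytes,
// or 0 if the properties could not be queried.
size_t queryConstMemBytes()
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 0;
    }
    printf("device: %s\n", prop.name);
    printf("constant memory: %zu bytes\n", prop.totalConstMem);
    printf("shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);
    return prop.totalConstMem;
}

int main()
{
    return queryConstMemBytes() > 0 ? 0 : 1;
}
```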

sergeyn · May 27, 2009, 8:25pm

Also, you would need exclusive access to the device to make that work, since your algorithm most probably won’t like it if something else kicks in and uses the device, for instance to render a GUI.

_Tom · May 27, 2009, 8:34pm

Yes, I think this would really be a hack. Even if I could get it to work, it surely wouldn’t work reliably, so I don’t think I will even try.

BTW, it would be useful if CUDA allowed a “kernel chain” where each kernel keeps the shared memory of the preceding one… but maybe it would have too many limitations (e.g. on the number of blocks).
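
Since shared memory only lives as long as its thread block, the usual substitute for such a chain is to let the first kernel write to global memory and have the second kernel stage its block's slice into shared memory as its first step. A minimal sketch under those assumptions; `stageOne`, `stageTwo`, `TILE` and `runChain` are hypothetical names, and the per-block reversal is just a stand-in for work that reuses neighbouring values.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define TILE 128

// First kernel: leaves its results in global memory.
__global__ void stageOne(float *buf, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = (float)i;
}

// Second kernel: one global load per thread into shared memory,
// then all further reads of the tile are served from shared memory.
__global__ void stageTwo(const float *buf, float *out, int n)
{
    __shared__ float tile[TILE];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? buf[i] : 0.0f;
    __syncthreads();   // every thread reaches this, so the tile is complete

    // Read a value loaded by another thread of the same block.
    int j = blockDim.x - 1 - threadIdx.x;
    if (i < n) out[i] = tile[j];
}

// Returns the number of wrong outputs (0 means success), or -1 on a CUDA error.
int runChain()
{
    const int blocks = 4, n = blocks * TILE;
    float *devBuf = 0, *devOut = 0;
    if (cudaMalloc(&devBuf, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&devOut, n * sizeof(float)) != cudaSuccess) { cudaFree(devBuf); return -1; }

    // Launches in the same stream run in order, so stageTwo sees stageOne's output.
    stageOne<<<blocks, TILE>>>(devBuf, n);
    stageTwo<<<blocks, TILE>>>(devBuf, devOut, n);

    std::vector<float> hostOut(n);
    cudaError_t err = cudaMemcpy(hostOut.data(), devOut, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(devBuf);
    cudaFree(devOut);
    if (err != cudaSuccess) return -1;

    // Each block should contain its own slice in reversed order.
    int mismatches = 0;
    for (int b = 0; b < blocks; ++b)
        for (int t = 0; t < TILE; ++t)
            if (hostOut[b * TILE + t] != (float)(b * TILE + TILE - 1 - t)) ++mismatches;
    return mismatches;
}

int main()
{
    printf("mismatches: %d\n", runChain());
    return 0;
}
```

This costs one coalesced global load per element in the second kernel, which is the price of not depending on leftover shared memory contents.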

Cygnus_X1 · May 28, 2009, 1:34pm

Be warned about constant memory: it is not exactly as fast as shared memory!

Quote from the Programming Guide:
