Shared Memory Help needed

Hello experts,

Is there any way to retain the contents of shared memory after a kernel completes and reuse them in the next launch?
That is, suppose we launch a kernel that loads some data into shared memory.
If we then launch the same kernel again, can it reuse that shared memory without copying the data from global memory again?

Is there any other trick for achieving the same thing?

Thanks in advance.


Not officially, at least. Shared memory only has the lifetime of the block that uses it. If you launch at most as many blocks as can run concurrently in one wave, you might be lucky and find the contents still there on the next invocation of the same kernel. There are no guarantees, however (not even that the same block ends up in the same shared memory area on the same SM).

I haven’t heard of any experiments in this direction either (although they would be simple to carry out). I’m pretty sure experiments have a higher chance of success on compute capability 1.x devices than on Fermi.
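
For anyone who wants to try, here is a minimal sketch of such an experiment. It relies on reading uninitialized shared memory, so whatever it reports is specific to the device and driver and is not something to build on. The names fill_shared, probe_shared, count_surviving_words, kWords and kMagic are made up for this illustration, and launching one block per SM is only an assumption about what fits in a single wave.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kWords = 1024;               // 4 KiB of shared memory per block
constexpr unsigned kMagic = 0xC0FFEE00u;   // arbitrary recognizable pattern

// First launch: fill shared memory with the pattern.
__global__ void fill_shared(unsigned *sink) {
    __shared__ unsigned buf[kWords];
    for (int i = threadIdx.x; i < kWords; i += blockDim.x)
        buf[i] = kMagic + i;
    __syncthreads();
    // Write one word back so the compiler cannot drop the stores.
    if (threadIdx.x == 0) sink[blockIdx.x] = buf[kWords - 1];
}

// Second launch: read shared memory WITHOUT initializing it and count
// how many words still hold the pattern. The contents are unspecified.
__global__ void probe_shared(int *survivors) {
    __shared__ unsigned buf[kWords];
    volatile unsigned *p = buf;            // force real loads
    int hits = 0;
    for (int i = threadIdx.x; i < kWords; i += blockDim.x)
        if (p[i] == kMagic + i) ++hits;
    atomicAdd(&survivors[blockIdx.x], hits);
}

// Runs both kernels back to back; returns the total number of surviving
// words over all blocks, or -1 on a CUDA error.
int count_surviving_words(int blocks) {
    unsigned *sink = nullptr;
    int *survivors = nullptr;
    if (cudaMalloc(&sink, blocks * sizeof(unsigned)) != cudaSuccess) return -1;
    if (cudaMalloc(&survivors, blocks * sizeof(int)) != cudaSuccess) {
        cudaFree(sink);
        return -1;
    }
    cudaMemset(survivors, 0, blocks * sizeof(int));

    fill_shared<<<blocks, 256>>>(sink);
    probe_shared<<<blocks, 256>>>(survivors);

    std::vector<int> host(blocks, 0);
    cudaError_t err = cudaMemcpy(host.data(), survivors, blocks * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(sink);
    cudaFree(survivors);
    if (err != cudaSuccess) return -1;

    int total = 0;
    for (int b = 0; b < blocks; ++b) total += host[b];
    return total;
}

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::printf("no usable CUDA device\n");
        return 1;
    }
    int blocks = prop.multiProcessorCount;  // assumption: one block per SM
    int total = count_surviving_words(blocks);
    std::printf("%d of %d words survived\n", total, blocks * kWords);
    return 0;
}
```

Even if every word survives on one machine, that says nothing about another GPU, another driver version, or a run where some other kernel happens to execute in between.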

The fact that nobody has reported on this might indicate that it's just not worth doing. The real answer to your problem is probably to do more work per block, so that the cost of fetching from global memory is amortized.
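
To illustrate what "more work per block" can look like, here is a rough sketch in which the loop that would otherwise relaunch the kernel is moved inside it, so each block loads its tile from global memory once and then performs all steps on shared memory. The kernel smooth_many_steps, the helper smooth_on_gpu, the tile size kTile and the averaging update are all invented for this example; your real computation will differ, and this only works as-is when the blocks do not need to exchange data between steps.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kTile = 256;   // elements per block, one per thread

// Each block loads its tile once, runs `steps` updates entirely in shared
// memory, and writes the result back once.
__global__ void smooth_many_steps(const float *in, float *out, int steps) {
    __shared__ float tile[kTile];
    const int i = threadIdx.x;
    const int g = blockIdx.x * kTile + i;

    tile[i] = in[g];                       // single fetch from global memory
    __syncthreads();

    for (int s = 0; s < steps; ++s) {
        // Average with the right-hand neighbour (wrapping inside the tile).
        float next = 0.5f * (tile[i] + tile[(i + 1) % kTile]);
        __syncthreads();                   // all reads done before any write
        tile[i] = next;
        __syncthreads();                   // all writes done before next read
    }

    out[g] = tile[i];                      // single store to global memory
}

// Host helper; in.size() must be a non-zero multiple of kTile.
// Returns an empty vector on a CUDA error.
std::vector<float> smooth_on_gpu(const std::vector<float> &in, int steps) {
    const size_t bytes = in.size() * sizeof(float);
    const int blocks = static_cast<int>(in.size() / kTile);
    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return {};
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) {
        cudaFree(d_in);
        return {};
    }
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);

    smooth_many_steps<<<blocks, kTile>>>(d_in, d_out, steps);

    std::vector<float> out(in.size());
    cudaError_t err = cudaMemcpy(out.data(), d_out, bytes,
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return {};
    return out;
}

int main() {
    std::vector<float> in(2 * kTile, 1.0f);
    std::vector<float> out = smooth_on_gpu(in, 100);
    if (out.empty()) {
        std::printf("CUDA error\n");
        return 1;
    }
    std::printf("out[0] = %f\n", out[0]);
    return 0;
}
```

With 100 steps per launch, the global memory traffic is one load and one store per element instead of 100 of each, which is the amortization meant above.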