Does shared memory (local memory for the work-group) survive for the next iteration of the same kernel?

Hi,

I am using a GPU with the following configuration:
CL_DEVICE_NAME: Tesla M2050
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DRIVER_VERSION: 285.05.33
CL_DEVICE_VERSION: OpenCL 1.1 CUDA
CL_DEVICE_OPENCL_C_VERSION: OpenCL C 1.1
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU

I would like to know whether the shared memory (local memory) of a work-group survives into the next invocation of the same kernel.

For example, suppose kernel1 uses a local memory buffer, say local1, and I write some data into it during the first iteration. Would that data still be in local memory on the following iterations, or is local memory initialized again for each kernel invocation?

for (ii = 0; ii < 100; ii++) {
    /* Enqueue the same kernel again; stop on the first failure. */
    err = clEnqueueNDRangeKernel(commands, kernel1, 1, NULL, globalws_2d_1, localws_2d_1, 0, NULL, NULL);
    if (err != CL_SUCCESS) break;
}

I hope my question is clear; if not, please let me know.

Regards,

Hi
Shared memory lives in dedicated on-chip hardware. Each SM (on Fermi) or SMX (on Kepler) has its own shared memory, which is used by the thread blocks that run on it. When you re-run a kernel, there is no guarantee that a given block will run on the same SM as before, so I can't see how it could use the shared memory from the previous run.

So my guess is that the shared memory does not survive.

Regards

Shared (local) memory only lives for as long as its work-group is running. Once the work-group has finished executing, there is no guarantee that the contents will survive. However, you can copy data from global memory into local memory at the start of your kernel and push it back at the end:

event_t async_work_group_copy(__local gentype *dst,
                              const __global gentype *src,
                              size_t num_gentypes,
                              event_t event)

event_t async_work_group_copy(__global gentype *dst,
                              const __local gentype *src,
                              size_t num_gentypes,
                              event_t event)

More info: http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/

So my advice would be: allocate a global memory buffer to store your data, copy it into local memory at the beginning of your kernel, and write the local memory back to global memory at the end.
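
To make that concrete, here is a minimal sketch of the load / work / store pattern, not a drop-in for kernel1. The kernel name `accumulate` and the arguments `state` and `scratch` are made up for illustration. It assumes `state` is a global buffer with one float per work-item, created once outside the host loop, and that `scratch` is a `__local` argument sized with clSetKernelArg (NULL arg_value, size = local work size * sizeof(float)). The `#ifndef __OPENCL_VERSION__` part is only a hypothetical host-side shim (one work-item per group) so the logic can be checked with an ordinary C compiler; an OpenCL compiler skips it.

```c
/* Sketch: keep per-work-group state in global memory across kernel launches.
 * An OpenCL compiler sees only the kernel. The shim below is a hypothetical
 * stand-in (one work-item per group) for building with a plain C compiler. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#include <string.h>
#define __kernel
#define __global
#define __local
#define CLK_LOCAL_MEM_FENCE 1
typedef int event_t;
static size_t shim_group_id = 0;   /* set by the host-side test loop */
static size_t get_group_id(unsigned d)   { (void)d; return shim_group_id; }
static size_t get_local_id(unsigned d)   { (void)d; return 0; }
static size_t get_local_size(unsigned d) { (void)d; return 1; }
static void barrier(int flags) { (void)flags; }
static void wait_group_events(int n, event_t *ev) { (void)n; (void)ev; }
/* Synchronous memcpy standing in for the real asynchronous copy. */
#define async_work_group_copy(dst, src, n, ev) \
    (memcpy((dst), (src), (n) * sizeof *(dst)), (event_t)0)
#endif

__kernel void accumulate(__global float *state,   /* persists between launches */
                         __local  float *scratch) /* valid for this launch only */
{
    const size_t lid  = get_local_id(0);
    const size_t n    = get_local_size(0);
    const size_t base = get_group_id(0) * n;      /* this group's slice of state */

    /* 1. Load the saved state of this work-group into local memory. */
    event_t e = async_work_group_copy(scratch, state + base, n, 0);
    wait_group_events(1, &e);

    /* 2. Work on the local copy (here: just count the launches). */
    scratch[lid] += 1.0f;
    barrier(CLK_LOCAL_MEM_FENCE);  /* all local writes done before copying out */

    /* 3. Save the result so the next launch can pick it up. */
    e = async_work_group_copy(state + base, scratch, n, 0);
    wait_group_events(1, &e);
}
```

With this layout the host loop from the question stays as it is: `state` carries the data from one clEnqueueNDRangeKernel call to the next, and nothing depends on what local memory happened to contain when the kernel starts.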