How do I work with shared memory?

I was wondering what the correct way is to work with shared-memory systems under OpenCL. I’m talking about both GPUs that share memory with the host and OpenCL running on the CPU (AMD or Intel drivers at the moment). I want to use a single buffer rather than constantly copying the data back and forth. I’m guessing that the correct approach is to use mapped memory, but it’s not clear from the documentation how the buffer should be allocated or exactly when it should be mapped and unmapped.

As far as I understand, I should allocate the buffers with OpenCL and pass the pointer on to the application. Then, when I need to run OpenCL work on the data, I should map the buffer to the device (or is it unmap the pointer?).

If I unmap and remap the pointer, would I still get the same pointer back when I remap it?

Also, is there a way to do this without putting the buffer in page-locked (pinned) memory?

Thanks