Maintaining contents of cache across kernel launches

For my problem it would be convenient to run a kernel on some data, get some results, and then return control to the CPU to allocate some memory on the GPU, with the amount depending on those results. I would then like to launch another kernel on the GPU with the contents of the cache (shared memory, if you will) intact and undisturbed.

Is this possible?

You’d do something like:

  1. Allocate memory on device

  2. Call Kernel, providing input data and output pointer

  3. Memcpy results back to the host

  4. Allocate GPU memory based on result

  5. Call the next kernel, using the output pointer from step 2 as input data
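The steps above could be sketched roughly as follows. This is only an illustration, not the original poster's code: the kernels `markPositive` and `gatherMarked`, the helper `keepPositive`, and the `CHECK` macro are all hypothetical names, and the "result" that sizes the second allocation is assumed to be a simple count of positive inputs.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort on any CUDA runtime error (hypothetical helper macro).
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err_)); \
            exit(1);                                                       \
        }                                                                  \
    } while (0)

// Hypothetical first kernel: flag positive elements and count them.
__global__ void markPositive(const int *in, int n, int *flags, int *count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int keep = in[i] > 0;
        flags[i] = keep;
        if (keep) atomicAdd(count, 1);
    }
}

// Hypothetical second kernel: reads the flags the first kernel left in
// global memory and copies the flagged elements into the new buffer.
__global__ void gatherMarked(const int *in, const int *flags, int n,
                             int *out, int *cursor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && flags[i]) {
        out[atomicAdd(cursor, 1)] = in[i];   // output order is not deterministic
    }
}

std::vector<int> keepPositive(const std::vector<int> &h_in)
{
    std::vector<int> h_out;
    int n = (int)h_in.size();
    if (n == 0) return h_out;
    int threads = 256, blocks = (n + threads - 1) / threads;

    // 1. Allocate memory on the device.
    int *d_in, *d_flags, *d_count;
    CHECK(cudaMalloc(&d_in, n * sizeof(int)));
    CHECK(cudaMalloc(&d_flags, n * sizeof(int)));
    CHECK(cudaMalloc(&d_count, sizeof(int)));
    CHECK(cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice));
    CHECK(cudaMemset(d_count, 0, sizeof(int)));

    // 2. First kernel: input data plus output pointers.
    markPositive<<<blocks, threads>>>(d_in, n, d_flags, d_count);
    CHECK(cudaGetLastError());

    // 3. Copy the result back (this blocking copy also waits for the kernel).
    int count = 0;
    CHECK(cudaMemcpy(&count, d_count, sizeof(int), cudaMemcpyDeviceToHost));

    if (count > 0) {
        // 4. Allocate GPU memory sized by that result.
        int *d_out;
        CHECK(cudaMalloc(&d_out, count * sizeof(int)));
        CHECK(cudaMemset(d_count, 0, sizeof(int)));   // reuse the counter as a cursor

        // 5. Second kernel: d_flags from step 2 is still valid in global memory.
        gatherMarked<<<blocks, threads>>>(d_in, d_flags, n, d_out, d_count);
        CHECK(cudaGetLastError());

        h_out.resize(count);
        CHECK(cudaMemcpy(h_out.data(), d_out, count * sizeof(int), cudaMemcpyDeviceToHost));
        CHECK(cudaFree(d_out));
    }
    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_flags));
    CHECK(cudaFree(d_count));
    return h_out;
}
```

Calling `keepPositive({3, -1, 4, 0, -5, 9})` from host code should return the three positive values in some order. Nothing is carried between the two launches except what sits in the `cudaMalloc`'d buffers.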

Shared memory is not maintained between kernel launches, but global memory is.

The second kernel will only be able to see what the first kernel wrote if you use global memory.
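If the working set really does live in shared memory, one way to carry it across launches is to write it out to global memory at the end of one kernel and reload it at the start of the next. The sketch below assumes a made-up kernel `step`, a made-up host helper `stateSurvives`, and a fixed tile of 256 floats per block; error checking is left out for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define TILE 256   // assumed per-block working-set size, in floats

// Hypothetical kernel: restore the block's state from global memory into
// shared memory, update it, then save it back before the kernel exits.
__global__ void step(float *state, float delta)
{
    __shared__ float tile[TILE];
    int i = blockIdx.x * TILE + threadIdx.x;

    tile[threadIdx.x] = state[i];   // refill shared memory at kernel start
    __syncthreads();

    tile[threadIdx.x] += delta;     // stand-in for the real shared-memory work
    __syncthreads();

    state[i] = tile[threadIdx.x];   // persist: shared memory is lost at exit
}

// Runs two launches and reports whether the second saw the first one's data.
bool stateSurvives(int blocks)
{
    int n = blocks * TILE;
    float *d_state;
    cudaMalloc(&d_state, n * sizeof(float));
    cudaMemset(d_state, 0, n * sizeof(float));   // all-zero bytes are 0.0f

    step<<<blocks, TILE>>>(d_state, 1.0f);       // first launch
    // Host-side work (for example, further allocations) could happen here.
    step<<<blocks, TILE>>>(d_state, 2.0f);       // second launch resumes from d_state

    std::vector<float> h(n);
    cudaMemcpy(h.data(), d_state, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_state);

    for (float v : h)
        if (v != 3.0f) return false;             // 0 + 1 + 2 expected everywhere
    return true;
}
```

The cost of this pattern is one extra global load and store per element per launch.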

Moreover, the size of the shared memory and texture caches is so small that unless your kernel is extremely short, the time required to refill the caches at kernel startup is probably negligible.