Keep kernel alive?

Is it possible to somehow keep an executed CUDA kernel alive after execution, so that global device memory can still be accessed?

Basically, I have a kernel that returns a global memory address. However, I can't access that address after the kernel has executed; it seems to have been deallocated.

Kernels can’t deallocate global memory. Are you talking about shared or local memory?

Interesting — I am talking about global memory. The setup is that I'm executing a PTX function on the device. It returns the address of an array created in global memory during execution. However, when I try to copy the data at that address from the device back to the host after the PTX kernel has executed, I get an invalid memory location error (can't remember the exact wording, will check it tomorrow).

I'm almost completely sure the address is correct, which is why I'm guessing it has been deallocated somehow. I'll double-check tomorrow that the array is in global memory.

Are you allocating the memory using cudaMalloc or by declaring it as a device variable at global scope?

Arrays allocated via cudaMalloc are not deallocated until you destroy the context (with cudaThreadExit) or call cudaFree. Static arrays at global scope also stay alive for the lifetime of the context, though you should be getting their address with cudaGetSymbolAddress…
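To illustrate (a minimal sketch with assumed names — the array `results` and kernel `fill` are hypothetical): a `__device__` array declared at global scope stays allocated for the lifetime of the context, and its device address should be obtained through the symbol API rather than computed by hand:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Global-scope device array: allocated when the module is loaded,
// freed only when the context is destroyed.
__device__ int results[256];

__global__ void fill(void)
{
    int i = threadIdx.x;
    results[i] = i * i;
}

int main(void)
{
    fill<<<1, 256>>>();
    cudaDeviceSynchronize();

    // The array is still valid after the kernel returns; ask the
    // runtime for its device address instead of hard-coding one.
    void *dev_ptr = NULL;
    cudaGetSymbolAddress(&dev_ptr, results);

    int host[256];
    cudaMemcpy(host, dev_ptr, sizeof(host), cudaMemcpyDeviceToHost);
    printf("results[10] = %d\n", host[10]);  // 10 * 10 = 100
    return 0;
}
```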

Well, neither, actually. I'm creating the array in PTX by declaring a .global array variable. It should behave like a static array declared in CUDA C, but I can't get its address with cudaGetSymbolAddress, as I'm not using CUDA C. I'm accessing CUDA through PyCUDA, Python bindings for CUDA.
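For what it's worth, PyCUDA does expose the symbol lookup: `Module.get_global()` plays the role of cudaGetSymbolAddress for a .global symbol. A rough sketch of the setup described above (the array name `results` and kernel `fill` are assumptions; this needs a CUDA-capable GPU to run):

```python
import numpy as np
import pycuda.autoinit          # creates and holds a CUDA context
import pycuda.driver as cuda
from pycuda.compiler import SourceModule

mod = SourceModule("""
// Compiles down to a .global declaration in the generated PTX.
__device__ int results[256];

__global__ void fill(void)
{
    results[threadIdx.x] = threadIdx.x * threadIdx.x;
}
""")

mod.get_function("fill")(block=(256, 1, 1), grid=(1, 1))

# get_global returns (device_pointer, size_in_bytes) for the symbol;
# the pointer stays valid for the lifetime of the context.
dev_ptr, size = mod.get_global("results")
host = np.empty(256, dtype=np.int32)
cuda.memcpy_dtoh(host, dev_ptr)
print(host[10])   # expect 10 * 10
```

The same call works for a module built from hand-written PTX (e.g. via `cuda.module_from_buffer`), as long as the .global symbol name matches.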

Turns out it was an addressing error on my side. So you were absolutely right: global memory does not get deallocated automatically.

Thanks for the help!