cudaThreadExit

In some test code I wrote, I noticed that cudaThreadExit totally reinitializes the CUDA runtime: after calling it, all previously allocated memory is freed. Unfortunately, I only found that out by reading the documentation more closely, after wasting a bit of time chasing down a “bug” 8(. The reason I called cudaThreadExit in the first place was to recover from a trashed thread that dereferenced a host pointer and got stuck in the “Unknown error” state. But I could see the same need coming up for deadlocked threads, or threads that simply take too long. Is there any way to kill a kernel without totally trashing the rest of the CUDA runtime?
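For reference, here is a minimal sketch of the behavior I mean. It is not my actual test code: the `report` helper, the buffer name `d_buf`, and the buffer size are made up for illustration. It uses cudaThreadExit because that is the call in question; newer toolkits deprecate it in favor of cudaDeviceReset, which does the same thing. I would expect the second cudaMemset to fail because the allocation is gone, but I have not pinned down which error code it returns.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: print the status of a runtime call without aborting,
// so the rest of the sequence still runs.
static void report(const char *what, cudaError_t err) {
    printf("%-28s -> %s\n", what, cudaGetErrorString(err));
}

int main() {
    const size_t bytes = 1024 * sizeof(float);  // arbitrary size for illustration
    float *d_buf = NULL;

    // Allocate and touch a device buffer; both calls should succeed.
    report("cudaMalloc", cudaMalloc((void **)&d_buf, bytes));
    report("cudaMemset before exit", cudaMemset(d_buf, 0, bytes));

    // Tear down the runtime state; every allocation made so far goes with it.
    report("cudaThreadExit", cudaThreadExit());

    // d_buf is now stale, so this call is expected to report an error
    // (assumption: the exact error code may differ between toolkit versions).
    report("cudaMemset after exit", cudaMemset(d_buf, 0, bytes));

    // A new allocation implicitly reinitializes the runtime and should work again.
    report("cudaMalloc after exit", cudaMalloc((void **)&d_buf, bytes));
    cudaFree(d_buf);
    return 0;
}
```

So recovering this way means reallocating and re-uploading everything the application had on the device, which is exactly the cost I would like to avoid.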