Launch another kernel from a kernel on the GPU itself

Hi,

I have a kernel that does its calculations without any __syncthreads() calls. When the last thread finishes, I want to launch another kernel from the GPU itself to do a different processing step on the same data, without returning control to the CPU (I guess I would lose a lot of time if I did). Is this possible? The data is already on the GPU, so I just need to launch different code on it. How is this done (again, without calling __syncthreads())?

Thanks in advance.

You cannot call a kernel from another kernel; however, you won’t lose much time exiting one kernel and entering another.
Just call the kernels in sequence on the CPU side.

Note that kernel calls are asynchronous, but the GPU can process only one kernel at a time, so kernels will be executed in the order you call them. I believe the driver can efficiently hide any launch overhead that might exist.

excellent!

thanks a lot

If you know you are going to run two kernels in a row, you can call them both from the host (with no synchronization in between), and the driver will queue them up for sequential execution. You are guaranteed that the memory writes from the first kernel will finish before the second kernel starts.
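
For what it's worth, here is a minimal sketch of that pattern, assuming both kernels go into the default stream. The kernel names stage_one and stage_two, the helper run_pipeline, and the arithmetic they perform are made up for illustration; only the launch pattern matters. The host issues both launches back to back with no synchronization in between, and the blocking cudaMemcpy at the end waits for both kernels before copying the result.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical first stage: each thread doubles one element.
__global__ void stage_one(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical second stage: reads what stage_one wrote and adds 1.
__global__ void stage_two(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Runs both stages on n elements (n > 0).
// Returns the number of wrong elements, or -1 on a CUDA error.
int run_pipeline(int n)
{
    std::vector<float> host(n);
    for (int i = 0; i < n; ++i) host[i] = (float)i;

    float *dev = nullptr;
    size_t bytes = n * sizeof(float);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return -1;
    if (cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return -1;
    }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Both launches return immediately; the data never leaves the GPU
    // and the host does not wait between the two kernels.
    stage_one<<<blocks, threads>>>(dev, n);
    stage_two<<<blocks, threads>>>(dev, n);

    // This blocking copy runs only after both kernels have finished.
    cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return -1;

    // stage_two must have seen stage_one's writes: expect 2*i + 1 everywhere.
    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (host[i] != 2.0f * i + 1.0f) ++mismatches;
    return mismatches;
}

int main()
{
    int result = run_pipeline(1 << 20);
    if (result < 0) { printf("CUDA error\n"); return 1; }
    printf("mismatches: %d\n", result);
    return result == 0 ? 0 : 1;
}
```

If stage_two ever started before stage_one's writes had landed, the check at the end would report mismatches, so a count of zero is consistent with the ordering guarantee described above.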