Nested kernels: how to launch a kernel from another kernel

Hi all,

I tried to check whether this subject had already been answered on the forum, but the search function didn’t find any similar topics.
I’m currently trying to adapt an algorithm to CUDA. This algorithm is composed of parts which are highly parallel.
The problem is that those parallel parts are nested inside one another.

Question: Is it possible to call a kernel from inside another kernel?

I tried to do it myself first, and ended up with the following error: “call cannot be configured”.
I think it might be tricky to do this, since the grid and block parameters would have to be redefined for the inner call.



It’s not possible to call a kernel from inside a kernel.

However, what you can do is save kernel1’s state (for example onto a stack in device memory), return from kernel1, launch kernel2 from the host, get its results, and then resume kernel1’s processing by launching it again from the main (host) code.
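Here is a minimal sketch of that split, assuming the nested call sits at a single point in kernel1 so that kernel1 can be cut into a "before" and an "after" part. The names kernel1_before, kernel2, kernel1_after and run_split are hypothetical, and the arithmetic is only a placeholder for the real algorithm. The "stack" is reduced to one saved value per thread in device memory, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical first half of kernel1: runs up to the point where the nested
// call would have happened and saves its per-thread state to device memory.
__global__ void kernel1_before(const float* in, float* state, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) state[i] = in[i] + 1.0f;
}

// Hypothetical kernel2: the inner parallel step, launched from the host with
// its own grid and block configuration.
__global__ void kernel2(const float* state, float* inner, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) inner[i] = state[i] * state[i];
}

// Hypothetical second half of kernel1: resumes from the saved state and uses
// the result produced by kernel2.
__global__ void kernel1_after(const float* state, const float* inner,
                              float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = inner[i] - state[i];
}

// Host driver (assumes a non-empty input). All intermediate data stays in
// device memory between the three launches.
std::vector<float> run_split(const std::vector<float>& h_in) {
    int n = static_cast<int>(h_in.size());
    size_t bytes = n * sizeof(float);

    float *d_in, *d_state, *d_inner, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_state, bytes);
    cudaMalloc(&d_inner, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Launches issued to the same (default) stream execute in order, so each
    // kernel sees the completed output of the previous one.
    kernel1_before<<<blocks, threads>>>(d_in, d_state, n);
    kernel2<<<blocks, threads>>>(d_state, d_inner, n);
    kernel1_after<<<blocks, threads>>>(d_state, d_inner, d_out, n);

    // This copy waits for the preceding kernels to finish.
    std::vector<float> h_out(n);
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_state);
    cudaFree(d_inner);
    cudaFree(d_out);
    return h_out;
}
```

Calling run_split from the host with a vector of floats returns (x + 1)^2 - (x + 1) for each input x. If kernel1 needs the nested step several times, the same pattern would be repeated in a host-side loop.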

Hope that made sense!


The GPU architecture is not made to handle nested kernel calls, so you cannot make one kernel call another kernel. Also, a kernel is invoked only by the CPU, and the next kernel starts only after the previous one has finished executing.

What you can probably do is:

  1. Use device memory to store the results of the first kernel, then read them back in the next kernel.
  2. Restructure the algorithm so that the whole task can be done in one kernel. That is possible if the (supposed) second kernel doesn’t need results from any block other than the one it is running in. Then you can use shared memory.
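
The second option could look roughly like the sketch below, assuming the "second kernel" only needs values produced inside its own block. The names fused_kernel, run_fused and BLOCK are hypothetical, and the squaring and summing are stand-ins for the real two stages. Stage 1 writes its results to shared memory, __syncthreads() makes them visible to the whole block, and stage 2 combines them without a second launch.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int BLOCK = 256;  // threads per block; must be a power of two here

// Hypothetical fused kernel: stage 1 squares each element, stage 2 sums the
// squares of the block. Stage 2 reads only results from its own block, so
// shared memory is enough to pass data between the two stages.
__global__ void fused_kernel(const float* in, float* block_sums, int n) {
    __shared__ float tile[BLOCK];
    int t = threadIdx.x;
    int i = blockIdx.x * BLOCK + t;

    // Stage 1 (what would have been the first kernel).
    tile[t] = (i < n) ? in[i] * in[i] : 0.0f;
    __syncthreads();  // all stage-1 results of this block are now visible

    // Stage 2 (what would have been the second kernel): tree reduction.
    for (int stride = BLOCK / 2; stride > 0; stride /= 2) {
        if (t < stride) tile[t] += tile[t + stride];
        __syncthreads();
    }
    if (t == 0) block_sums[blockIdx.x] = tile[0];
}

// Host driver (assumes a non-empty input); returns one partial sum per block.
// Error checking is omitted for brevity.
std::vector<float> run_fused(const std::vector<float>& h_in) {
    int n = static_cast<int>(h_in.size());
    int blocks = (n + BLOCK - 1) / BLOCK;

    float *d_in, *d_sums;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_sums, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    fused_kernel<<<blocks, BLOCK>>>(d_in, d_sums, n);

    std::vector<float> h_sums(blocks);
    cudaMemcpy(h_sums.data(), d_sums, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_sums);
    return h_sums;
}
```

If the second stage did need results from other blocks, this would not work, because __syncthreads() only synchronizes threads within one block. In that case option 1 (separate kernels with the data kept in device memory) is the safer route.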