How do I kill dynamically generated child kernels?

In my application, a parent kernel generates many child kernels using dynamic parallelism, some running for a long time. Inside a child kernel, I insert a trap instruction like the following when some goal is achieved, hoping that only that child kernel is killed.

if (goalAchieved == true) asm("trap;");

I found that once asm("trap;") is executed, the subsequent child kernels no longer execute correctly.

Is there a way to elegantly kill child kernels?

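If I remember correctly, trap has negative side effects: per the PTX documentation it aborts execution and interrupts the host, rather than quietly ending one kernel.
Perhaps consult the guides to verify whether it can be used in normal operation.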

But I do not follow why you wish to kill child kernels.
Child kernels normally terminate on their own,
and a child should be able to terminate gracefully on a certain condition.

Wouldn't a simple "goto done;" work, where done is a label at the end of the kernel function? This assumes goalAchieved can be evaluated inside your __global__ kernel function, as opposed to a subroutine called by this kernel.
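
To illustrate the suggestion above, here is a minimal sketch of a kernel that leaves through a normal exit instead of a trap. A plain `return` is used in place of `goto done;`, since both simply end the calling thread. The names `searchKernel`, `foundIndex`, and `runEarlyExitDemo` are hypothetical, and the kernel is launched from the host only to keep the sketch short; the kernel body would look the same when launched from a parent kernel.

```cuda
#include <cuda_runtime.h>

// Hypothetical child kernel: every thread tests one element against `target`.
__global__ void searchKernel(const int *data, int n, int target, int *foundIndex)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    bool goalAchieved = (data[i] == target);
    if (goalAchieved) {
        *foundIndex = i;  // record the answer
        return;           // ends this thread only; no error is raised
    }

    // ... the remaining long-running work would go here ...
}

// Hypothetical host-side demo; returns the index found, or -1 on a CUDA error.
int runEarlyExitDemo()
{
    const int n = 1024;
    int h_data[n];
    for (int i = 0; i < n; ++i) h_data[i] = i;  // unique values: one thread matches

    int *d_data = nullptr, *d_found = nullptr;
    int found = -1;
    if (cudaMalloc(&d_data, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_found, sizeof(int)) != cudaSuccess) {
        cudaFree(d_data);
        return -1;
    }
    cudaMemcpy(d_data, h_data, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_found, &found, sizeof(int), cudaMemcpyHostToDevice);

    searchKernel<<<(n + 255) / 256, 256>>>(d_data, n, 700, d_found);
    cudaError_t err = cudaDeviceSynchronize();  // launch errors surface here

    if (err == cudaSuccess)
        cudaMemcpy(&found, d_found, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_found);
    return (err == cudaSuccess) ? found : -1;
}
```

One caveat with this pattern: a thread that returns early must not skip a `__syncthreads()` that the rest of its block still reaches, so in kernels with barriers the exit decision should be made uniformly per block.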

The reason I want to terminate a child kernel early is that it has many thread blocks. Say thread block 0 has already found the answer: all the other thread blocks still have to be scheduled onto SMs just to execute a conditional branch and terminate. I'd like to avoid that cost.

I see.

You could maintain an atomic termination flag per child kernel: each block of a particular child kernel checks its kernel's flag, and any block of that kernel is allowed to set it.
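
A rough sketch of the flag idea, under some assumptions: `flaggedChild`, `doneFlag`, `blocksThatWorked`, and `runFlagDemo` are hypothetical names, block 0 is arbitrarily chosen as the block that reaches the goal, and the kernel is launched from the host for brevity. With dynamic parallelism, the parent would presumably pass each child its own flag pointer.

```cuda
#include <cuda_runtime.h>

// Hypothetical child kernel. `doneFlag` is this kernel's own termination flag.
__global__ void flaggedChild(int *doneFlag, int *blocksThatWorked)
{
    // One thread reads the flag and shares it, so the whole block
    // takes the same branch and can exit together.
    __shared__ int stop;
    if (threadIdx.x == 0)
        stop = *((volatile int *)doneFlag);  // volatile: re-read from memory
    __syncthreads();
    if (stop) return;

    // Count blocks that got past the check (for the demo only).
    if (threadIdx.x == 0) atomicAdd(blocksThatWorked, 1);

    // ... long-running work; a loop here could re-check the flag periodically ...

    // Assumption for the demo: block 0 is the one that achieves the goal.
    if (blockIdx.x == 0 && threadIdx.x == 0)
        atomicExch(doneFlag, 1);
}

// Hypothetical host-side demo; returns the final flag value, or -1 on error.
// *workedOut receives the number of blocks that did not exit early.
int runFlagDemo(int *workedOut)
{
    int *d_flag = nullptr, *d_worked = nullptr;
    int flag = -1;
    if (cudaMalloc(&d_flag, sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_worked, sizeof(int)) != cudaSuccess) {
        cudaFree(d_flag);
        return -1;
    }
    cudaMemset(d_flag, 0, sizeof(int));
    cudaMemset(d_worked, 0, sizeof(int));

    flaggedChild<<<4096, 128>>>(d_flag, d_worked);
    cudaError_t err = cudaDeviceSynchronize();

    if (err == cudaSuccess) {
        cudaMemcpy(&flag, d_flag, sizeof(int), cudaMemcpyDeviceToHost);
        cudaMemcpy(workedOut, d_worked, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_flag);
    cudaFree(d_worked);
    return (err == cudaSuccess) ? flag : -1;
}
```

How many blocks exit early depends on scheduling and varies from run to run. Every remaining block is still scheduled onto an SM; the flag only reduces what each one does to a single read and a branch.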

You could also invert the kernel issuance logic.
Instead of issuing all blocks kernel by kernel, which results in a queue holding all blocks of a few kernels, you could invert the queue by issuing some blocks of all (or more) kernels.
This may reduce the number of blocks that hit the embedded conditional.

I also wonder whether this is something better handed over to the host.
The host would be able to streamline this better, in my mind.