Dynamic Parallelism

Dear all

See the example below:

// Parent kernel
__global__ void parent(…)
{
    int i = threadIdx.x;

    // Only one thread of the parent block launches the child grid
    if (i == 0)
        children<<<1024, 128>>>(…);
}

// Call parent (from the host)
parent<<<1, 128>>>(…);


My question is (perhaps a dumb one):

If the only block of “parent” runs on SMX 0, will the child blocks all run on SMX 0, or will they be distributed across all SMXs?

Thanks

Luis Gonçalves

All child kernel launches are just like host (parent) kernel launches in this respect - they will utilize all available resources and are not limited to a single SM, regardless of the behavior of the parent kernel.

This is easy to prove with a bit of microbenchmark-style coding - have each block of the child kernel record which SM it is running on. Not sure how to do that? Search this forum for smid - you will find examples of a function, usable in device code, that tells you which SM you are executing on.
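
For reference, here is a minimal sketch of that experiment. It is not taken from any particular forum post: the helper name get_smid, the buffer name sm_of_block and the host code are my own choices for illustration. It reads the PTX special register %smid through inline assembly, has thread 0 of every child block store the value, and counts the distinct SM IDs on the host. Keep in mind that the PTX documentation describes %smid as volatile and the SM numbering as not necessarily contiguous, so treat the result as an indication rather than a guarantee.

```cuda
#include <cstdio>
#include <set>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: read the PTX special register %smid,
// i.e. the ID of the SM executing the calling thread.
__device__ unsigned int get_smid()
{
    unsigned int id;
    asm volatile("mov.u32 %0, %%smid;" : "=r"(id));
    return id;
}

// Child kernel: thread 0 of each block records the SM its block runs on.
__global__ void children(unsigned int *sm_of_block)
{
    if (threadIdx.x == 0)
        sm_of_block[blockIdx.x] = get_smid();
}

// Parent kernel: one thread launches the child grid, as in the question.
__global__ void parent(unsigned int *sm_of_block)
{
    if (threadIdx.x == 0)
        children<<<1024, 128>>>(sm_of_block);
}

int main()
{
    const int num_blocks = 1024;  // must match the child launch above
    unsigned int *d_sm = nullptr;
    cudaMalloc(&d_sm, num_blocks * sizeof(unsigned int));

    parent<<<1, 128>>>(d_sm);

    // A parent grid is not complete until its child grids have finished,
    // so this host-side synchronize also waits for the children.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::vector<unsigned int> h_sm(num_blocks);
    cudaMemcpy(h_sm.data(), d_sm, num_blocks * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_sm);

    // Count how many different SMs the child blocks were placed on.
    std::set<unsigned int> distinct(h_sm.begin(), h_sm.end());
    printf("child blocks ran on %zu distinct SM(s)\n", distinct.size());
    return 0;
}
```

Dynamic parallelism needs relocatable device code and the device runtime library, so build with something like nvcc -rdc=true -arch=sm_70 smid.cu -lcudadevrt (the file name and the architecture are placeholders; any GPU of compute capability 3.5 or higher should do). If the count printed is greater than 1, the child blocks were not confined to the SM that ran the parent block.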