Distribution of Threads across the SMs

Dear All

If I launch a kernel from the host (one that runs on a single SM, something like kernel<<<1,16>>>()) and then launch another kernel from inside that kernel (on the device), will everything run on the same SM or not?

Thanks

Luis Gonçalves

The behavior of the CUDA work distributor is mostly unspecified. If you launch two kernels, each consisting of a single threadblock, those two threadblocks will most likely execute on separate SMs, assuming the kernels run concurrently and your GPU has two or more SMs. But there is no guarantee of that behavior.
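If you want to see what happens on your own GPU, one option is to have each kernel record the SM it runs on by reading the PTX special register %smid. The following is only a rough sketch: the names `parent`, `child`, and `current_smid` are made up for illustration, it assumes a device that supports dynamic parallelism (compute capability 3.5 or higher), and it has to be built with relocatable device code, for example `nvcc -rdc=true -arch=sm_70 smid.cu -lcudadevrt` (adjust the architecture to your GPU). Keep in mind that %smid only reports where a thread is at the moment it is read, so whatever you observe is one run on one device, not a guarantee.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns the ID of the SM the calling thread is
// executing on at the moment of the read (PTX special register %smid).
__device__ unsigned int current_smid() {
    unsigned int id;
    asm volatile("mov.u32 %0, %%smid;" : "=r"(id));
    return id;
}

// Child kernel, launched from the device. Thread 0 records its SM ID.
__global__ void child(unsigned int *out) {
    if (threadIdx.x == 0) {
        out[1] = current_smid();
    }
}

// Parent kernel, launched from the host as a single block of 16 threads.
// Thread 0 records its own SM ID and then launches the child kernel.
__global__ void parent(unsigned int *out) {
    if (threadIdx.x == 0) {
        out[0] = current_smid();
        child<<<1, 16>>>(out);
    }
}

int main() {
    unsigned int *d_out = nullptr;
    unsigned int h_out[2] = {0, 0};

    cudaMalloc(&d_out, 2 * sizeof(unsigned int));

    parent<<<1, 16>>>(d_out);

    // The parent grid is not considered complete until its child grid has
    // finished, so this also waits for the child kernel.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    printf("parent ran on SM %u, child ran on SM %u\n", h_out[0], h_out[1]);

    cudaFree(d_out);
    return 0;
}
```

Running this several times, or on different GPUs, may print the same SM for both kernels or different ones. Either result is consistent with the point above: the placement is up to the work distributor, and code should not depend on it.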