Concurrent execution of kernels on Fermi

Hi all,

The official documentation states that a Fermi GPU can have up to 16 kernels resident at once, but gives no clue about how they are placed. A full Fermi chip has 16 SMs, so which of the following actually happens:
(a) each SM holds blocks from only one kernel at a time;
(b) blocks from all kernels are mixed and assigned to SMs in no particular order, which means one SM might hold blocks from every kernel.

As a simple test case, suppose 15 kernels have one block each and the 16th kernel has 20 blocks. How would the blocks be distributed among the SMs?
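
One way to probe this empirically is to have every block record the SM it lands on. The following is only a rough sketch of such an experiment, not a statement of how the hardware scheduler works. It assumes a device that reports concurrent kernel support, reads the PTX special register %smid through inline assembly, and uses names of my own choosing (probe, smid, kSpinCycles). The spin length is an arbitrary guess meant to keep the kernels alive long enough to overlap, and PTX documents %smid as volatile, so treat the results as a snapshot of one run.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Read the ID of the SM the calling thread is running on (PTX %smid).
__device__ unsigned int smid() {
    unsigned int id;
    asm volatile("mov.u32 %0, %%smid;" : "=r"(id));
    return id;
}

// Each block records its SM, then spins so the kernels have time to overlap.
__global__ void probe(unsigned int *sm_of_block, long long spin_cycles) {
    if (threadIdx.x == 0) {
        sm_of_block[blockIdx.x] = smid();
    }
    long long start = clock64();
    while (clock64() - start < spin_cycles) {
        // busy wait
    }
}

int main() {
    const int kNumKernels = 16;              // 15 small kernels + 1 large one
    const int kBigBlocks = 20;               // block count of the 16th kernel
    const long long kSpinCycles = 100000000LL;  // assumption: long enough to overlap

    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    printf("%s: %d SMs, concurrentKernels=%d\n",
           prop.name, prop.multiProcessorCount, prop.concurrentKernels);

    cudaStream_t streams[kNumKernels];
    unsigned int *d_sm[kNumKernels];
    int blocks[kNumKernels];

    for (int i = 0; i < kNumKernels; ++i) {
        blocks[i] = (i == kNumKernels - 1) ? kBigBlocks : 1;
        CHECK(cudaStreamCreate(&streams[i]));
        CHECK(cudaMalloc((void **)&d_sm[i], blocks[i] * sizeof(unsigned int)));
    }

    // One kernel per stream, so the hardware is free to run them concurrently.
    for (int i = 0; i < kNumKernels; ++i) {
        probe<<<blocks[i], 32, 0, streams[i]>>>(d_sm[i], kSpinCycles);
        CHECK(cudaGetLastError());
    }
    CHECK(cudaDeviceSynchronize());

    // Copy back and print which SM each block of each kernel ran on.
    for (int i = 0; i < kNumKernels; ++i) {
        unsigned int h_sm[kBigBlocks];
        CHECK(cudaMemcpy(h_sm, d_sm[i], blocks[i] * sizeof(unsigned int),
                         cudaMemcpyDeviceToHost));
        for (int b = 0; b < blocks[i]; ++b) {
            printf("kernel %2d block %2d -> SM %u\n", i, b, h_sm[b]);
        }
        CHECK(cudaFree(d_sm[i]));
        CHECK(cudaStreamDestroy(streams[i]));
    }
    return 0;
}
```

If the printout shows two different kernels reporting the same SM ID, that would point toward (b), at least for that run. It would not prove their blocks were resident on that SM at the same instant; recording clock64() start and end times per block would be needed to check for actual overlap.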