Workgroup load balancing

Hi,

If I have a bunch of workgroups with different execution times, and 3 of them are executing on one multiprocessor at a given moment, what happens when one of them finishes? Is another one picked from the pool immediately, or does the multiprocessor keep running only the two others and defer loading another set of (3) workgroups until all currently executing workgroups finish? I hope for the first alternative, of course, but I am not sure.

Thanks

Flavius

I think there has been a debate on this in the forum before. Some experiments were done and it was found that the MP is stalled (that’s why it’s good to have workgroups/blocks with similar computation load). Can someone share some useful pointers?

Things may have changed with Fermi though!
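
To get a feel for how much the two alternatives from the question could differ, here is a small toy simulation in Python. It is only a sketch and does not model any real GPU scheduler: the function names, the 3 slots per multiprocessor, and the made-up execution times are all assumptions for illustration. It says nothing about which policy a given device actually uses.

```python
import heapq

def makespan_refill(durations, slots=3):
    """Policy A (hypothetical): a slot is refilled as soon as its workgroup finishes."""
    # Min-heap holding the time at which each slot becomes free.
    free_at = [0] * slots
    heapq.heapify(free_at)
    finish = 0
    for d in durations:
        start = heapq.heappop(free_at)   # earliest free slot takes the next workgroup
        end = start + d
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish

def makespan_batched(durations, slots=3):
    """Policy B (hypothetical): the next set starts only when the whole current set is done."""
    total = 0
    for i in range(0, len(durations), slots):
        # Each set takes as long as its slowest workgroup.
        total += max(durations[i:i + slots])
    return total

# Made-up execution times: one long workgroup in every set of three.
uneven = [10, 1, 1, 10, 1, 1, 10, 1, 1]
# Same number of workgroups, all with equal load.
uniform = [4] * 9

print(makespan_refill(uneven), makespan_batched(uneven))
print(makespan_refill(uniform), makespan_batched(uniform))
```

With the uneven times, immediate refill finishes at time 13 while the batched policy needs 30; with uniform times both finish at 12. So if the MP really does stall, balanced workgroups hide the cost, which matches the advice above.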

Isn’t this the advantage of having concurrent kernel execution?

There has been a ton of work done on work load balancing.
