Do blocks run sequentially or in parallel?

Hi everyone, as the title says, I want to know whether the blocks of a kernel launch run sequentially or in parallel. Example:

kernel<<<dim3(65535,4,1),dim3(32,16,1)>>>(…);

Do the 65535*4 blocks run one at a time (with the threads of the chosen block running in parallel), or do all the blocks (or some number of them) run in parallel, along with all their threads? Thanks!

Hello,

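The order of execution is not predefined. As long as enough blocks remain, the GPU runs at least as many blocks at a time as there are multiprocessors (SMs) on the card (14 in the case of the Tesla 2050). Each SM can run up to 8 blocks at a time, or fewer if a block's threads, registers, or shared memory use up the SM's resources first. With your 32x16 = 512 threads per block, for example, the limit of 1536 resident threads per SM on that card allows at most 3 blocks per SM. When a block finishes, a waiting block takes its place, so the whole grid is processed in batches.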
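If you want to see these numbers for your own card, here is a minimal sketch that asks the runtime how many blocks can be resident at once. It assumes a CUDA toolkit recent enough to provide cudaOccupancyMaxActiveBlocksPerMultiprocessor (CUDA 6.5 or later). The kernel body and the helper ceilDiv are hypothetical stand-ins, since the real kernel is not shown in the question, and the occupancy result depends on the registers and shared memory the actual kernel uses.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel in the question (the real body is unknown).
__global__ void kernel(float *out)
{
    // Linear block index in the 2D grid, then linear thread index in the 2D block.
    size_t block  = (size_t)blockIdx.y * gridDim.x + blockIdx.x;
    size_t thread = (size_t)threadIdx.y * blockDim.x + threadIdx.x;
    out[block * (blockDim.x * blockDim.y) + thread] = 1.0f;
}

// Hypothetical helper: integer division rounded up.
static long long ceilDiv(long long a, long long b) { return (a + b - 1) / b; }

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("no CUDA device found\n");
        return 1;
    }

    const dim3 grid(65535, 4, 1), block(32, 16, 1);
    const int threadsPerBlock = block.x * block.y * block.z;

    // How many blocks of this kernel fit on one SM at the same time.
    int blocksPerSM = 0;
    if (cudaOccupancyMaxActiveBlocksPerMultiprocessor(
            &blocksPerSM, kernel, threadsPerBlock, 0) != cudaSuccess) {
        printf("occupancy query failed\n");
        return 1;
    }

    const long long totalBlocks = (long long)grid.x * grid.y * grid.z;
    const long long resident = (long long)blocksPerSM * prop.multiProcessorCount;

    printf("SMs: %d, resident blocks per SM: %d\n",
           prop.multiProcessorCount, blocksPerSM);
    printf("total blocks: %lld, resident at once: %lld\n", totalBlocks, resident);
    if (resident > 0) {
        // Rough estimate only: blocks are replaced one by one as they finish.
        printf("approximate batches: %lld\n", ceilDiv(totalBlocks, resident));
    }
    return 0;
}
```

The batch count is only an estimate, because the scheduler does not wait for a whole batch to finish before starting new blocks. Since the order is not defined, the kernel should never rely on one block running before another.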

So this SM runs 8 blocks at a time; do the threads in that SM then run in parallel?

Kind of. Threads are executed in groups of 32 (warps). When a thread wants data from the GPU's global memory, it issues the request and then waits; in the meantime the SM switches to other threads, which issue their own memory requests or do other work. So not every thread is executing at every instant, but it is safer to assume that all the threads in a block run in parallel.
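To show what assuming parallel execution means in practice, here is a small sketch (not taken from the question) in which each block reverses its own tile of an array through shared memory. The names reverseTile, TILE, and expectedAt are made up for this illustration. The important part is __syncthreads(): because you cannot know which threads of the block have already run, every thread must wait at the barrier before reading a value written by another thread.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int TILE = 256;  // threads per block, and elements per tile

// Each block reverses its own TILE-element slice of the input.
__global__ void reverseTile(const int *in, int *out)
{
    __shared__ int tile[TILE];
    const int t = threadIdx.x;
    const int base = blockIdx.x * TILE;

    tile[t] = in[base + t];
    // Barrier: no thread may read tile[] until every thread in the block has written.
    __syncthreads();
    out[base + t] = tile[TILE - 1 - t];
}

// Host-side reference: the value expected at output index i when in[i] == i.
static int expectedAt(int i)
{
    const int b = i / TILE, t = i % TILE;
    return b * TILE + (TILE - 1 - t);
}

int main()
{
    const int numBlocks = 4, n = numBlocks * TILE;
    std::vector<int> host(n), result(n);
    for (int i = 0; i < n; ++i) host[i] = i;

    int *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, n * sizeof(int)) != cudaSuccess ||
        cudaMalloc(&dOut, n * sizeof(int)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(dIn, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    reverseTile<<<numBlocks, TILE>>>(dIn, dOut);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(result.data(), dOut, n * sizeof(int), cudaMemcpyDeviceToHost);

    bool ok = true;
    for (int i = 0; i < n; ++i)
        if (result[i] != expectedAt(i)) ok = false;
    printf(ok ? "ok\n" : "mismatch\n");

    cudaFree(dIn);
    cudaFree(dOut);
    return ok ? 0 : 1;
}
```

Note that the barrier only covers the threads of one block. There is no equivalent inside a kernel for waiting on other blocks in an ordinary launch, which is another reason not to depend on the order in which blocks run.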