Parallel Processing - NOT why?

Oops, I made a boneheaded mistake :( I called SetDevice before starting the threads, so all of the threads were using the same GPU. My bad.

Your timings may look like this because of initialization time. This issue has been discussed several times on this forum; see, for example, this thread.