Is there any difference in behavior of blocks vs. threads?

Mr.UNOwen · January 13, 2012, 2:06am

Is there a difference between threads and blocks when it comes to executing instructions, or is it just a physical constraint? In other words, is there a performance difference between func<<<2,1>>>() and func<<<1,2>>>(), or is it just a matter of hardware limits on the number of threads per block and the total number of blocks?

tera · January 13, 2012, 2:50am

Yes, there are big differences. Threads of the same block execute in parallel and can exchange data through shared (or global) memory. The order in which different blocks execute is undefined, and threads from different blocks have no common shared memory space.

Check section 2.2 of the Programming Guide.
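
To make the difference concrete, here is a minimal sketch (not taken from the Programming Guide). The kernel name `reverseWithinBlock`, the helper `runReverse`, and the array size of 8 are made up for illustration. Each block reverses its own slice of the array through a shared-memory buffer, so the same eight threads give different results depending on how they are grouped into blocks.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 8  // illustrative array size (assumption, not from the thread)

// Hypothetical kernel: each block reverses its own slice of `data`.
__global__ void reverseWithinBlock(int *data)
{
    // Shared memory: one private copy per block, sized at launch time.
    extern __shared__ int tile[];

    int local  = threadIdx.x;                            // index inside the block
    int global = blockIdx.x * blockDim.x + threadIdx.x;  // index in the whole grid

    tile[local] = data[global];
    __syncthreads();  // barrier for the threads of THIS block only

    data[global] = tile[blockDim.x - 1 - local];
}

// Fills a device array with 0..N-1, runs the kernel with the given launch
// shape, and copies the result into `out`. Returns false on any CUDA error.
static bool runReverse(int blocks, int threadsPerBlock, int *out)
{
    int host[N];
    for (int i = 0; i < N; ++i) host[i] = i;

    // Global memory: a single allocation that every block can read and write.
    int *dev = nullptr;
    if (cudaMalloc((void **)&dev, N * sizeof(int)) != cudaSuccess) return false;
    cudaMemcpy(dev, host, N * sizeof(int), cudaMemcpyHostToDevice);

    reverseWithinBlock<<<blocks, threadsPerBlock,
                         threadsPerBlock * sizeof(int)>>>(dev);
    bool ok = (cudaGetLastError() == cudaSuccess);

    // cudaMemcpy waits for the kernel to finish before copying.
    ok = ok && (cudaMemcpy(out, dev, N * sizeof(int),
                           cudaMemcpyDeviceToHost) == cudaSuccess);
    cudaFree(dev);
    return ok;
}

int main()
{
    int out[N];

    if (!runReverse(2, 4, out)) return 1;   // 2 blocks of 4 threads
    printf("<<<2,4>>>:");
    for (int i = 0; i < N; ++i) printf(" %d", out[i]);
    printf("\n");                           // prints: 3 2 1 0 7 6 5 4

    if (!runReverse(1, 8, out)) return 1;   // 1 block of 8 threads
    printf("<<<1,8>>>:");
    for (int i = 0; i < N; ++i) printf(" %d", out[i]);
    printf("\n");                           // prints: 7 6 5 4 3 2 1 0

    return 0;
}
```

With two blocks, neither block can see the other's `tile`, so the reversal stays inside each group of four elements. With one block, all eight threads share one `tile` and the whole array is reversed.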

Mr.UNOwen · January 13, 2012, 4:57am

Thanks, that’s useful to know. Regarding memory space, does that include the space I allocate on the device, or does CUDA just make multiple copies of the space I allocated? Are there any other characteristics that differ when coding?
