Shape of 2D warp: which threads are in a 2D warp?

One of the optimization strategies described in the reduction example completely unrolls the last 32 threads (i.e., the last warp), since the threads of a warp are SIMD-synchronous. However, that example uses a 1D block of threads.

Question: for a 2D block of threads, which threads are in the same warp? If tidx = threadIdx.x and tidy = threadIdx.y, which (tidx, tidy) pairs belong to the same warp and therefore don’t need the __syncthreads()?

See section 2.1 - “Thread Hierarchy” in the programming guide (2.0).

Thanks for the pointer. That was the missing link. If I combine that section with section 3.1 (A Set of SIMT Multiprocessors with On-Chip Shared Memory), then my answer is complete.
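
For anyone landing here later, here is how I read those two sections, written out as a small sketch. The thread ID of thread (x, y) in a block of size (Dx, Dy) is x + y * Dx, and each warp holds consecutive thread IDs starting from thread 0. The code is plain host-side C++ so it can be checked without a GPU. The function names and the 16x16 block are my own illustration, not anything from the guide or the SDK. Inside a kernel you would pass threadIdx.x, threadIdx.y and blockDim.x, and could use the built-in warpSize instead of the constant.

```cpp
#include <cassert>

// Assumed warp size of 32; device code can read the built-in warpSize instead.
constexpr int kWarpSize = 32;

// Linear thread ID inside a 2D block: thread (x, y) in a block of
// size (Dx, Dy) has ID x + y * Dx. (Hypothetical helper name.)
constexpr int linear_thread_id(int tidx, int tidy, int block_dim_x) {
    return tidx + tidy * block_dim_x;
}

// Warps are consecutive runs of kWarpSize linear thread IDs,
// with the first warp containing thread 0. (Hypothetical helper name.)
constexpr int warp_of(int tidx, int tidy, int block_dim_x) {
    return linear_thread_id(tidx, tidy, block_dim_x) / kWarpSize;
}

// Illustrative 16x16 block: each warp covers two full rows of 16 threads.
inline void check_16x16_block() {
    assert(warp_of(0, 0, 16) == 0);   // first thread of row 0, ID 0
    assert(warp_of(15, 1, 16) == 0);  // last thread of row 1, ID 31
    assert(warp_of(0, 2, 16) == 1);   // row 2 starts the next warp, ID 32
}
```

So in a 2D block, two threads share a warp exactly when their linear IDs fall in the same group of 32. For the unrolled last-warp step of the reduction, that suggests testing the linear ID (tidx + tidy * blockDim.x < 32) rather than tidx alone.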

No problem. The programming guide usually has all the information; it just isn’t always where you might expect it to be.