I'm trying to determine which warp a given thread falls in. Neither blockDim.x nor blockDim.y is necessarily a multiple of 32.
Will it be based on something like:
((threadIdx.y * blockDim.x) + threadIdx.x) >> 5 ?
The use case is one where each thread signals whether it has modified something. The warp leaders(!) collect the results and determine whether anything in the whole block was modified…
Thanks in advance.
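For purposes of assignment to a warp, each thread has a thread ID within its block, as defined in the programming guide:
The thread ID is assigned by incrementing threadIdx.x first, then threadIdx.y, then threadIdx.z, i.e. threadIdx.x + threadIdx.y * blockDim.x + threadIdx.z * blockDim.x * blockDim.y. For a 1D or 2D block, that is exactly the expression you wrote before the shift.
Once the thread ID is determined, threads with ID 0-31 belong to the first warp, 32-63 belong to the second warp, etc. Warps are filled by thread ID, not per row, so the block dimensions do not need to be multiples of 32. If the total thread count is not a multiple of 32, only the last warp is partially filled.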
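
As a rough sketch of the above (not the only way to do it): the helper names linear_thread_id, warp_index and lane_index, the kernel block_modified, and its two array parameters are all made up for illustration. The kernel also assumes a 1D grid and one int flag per thread. For the block-wide "did anything change" question it uses __syncthreads_or instead of warp leaders, which is a simpler substitute for what you described rather than the same scheme.

```cpp
// Let the helpers compile both under nvcc and as plain C++.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Warp size is 32 on current NVIDIA GPUs; device code can also read the
// built-in variable warpSize instead of hard-coding it.
constexpr unsigned kWarpSize = 32;

// Linear thread ID within a block: x varies fastest, then y, then z.
HOST_DEVICE inline unsigned linear_thread_id(unsigned tx, unsigned ty, unsigned tz,
                                             unsigned bx, unsigned by) {
    return tx + ty * bx + tz * bx * by;
}

// Which warp of the block the thread belongs to (same as tid >> 5).
HOST_DEVICE inline unsigned warp_index(unsigned tid) { return tid / kWarpSize; }

// Position of the thread inside its warp (0..31).
HOST_DEVICE inline unsigned lane_index(unsigned tid) { return tid % kWarpSize; }

#ifdef __CUDACC__
// Hypothetical kernel: `modified` holds one flag per thread, `block_flags`
// receives one result per block. Assumes a 1D grid (only blockIdx.x is used).
__global__ void block_modified(const int* modified, int* block_flags) {
    unsigned tid = linear_thread_id(threadIdx.x, threadIdx.y, threadIdx.z,
                                    blockDim.x, blockDim.y);
    unsigned threads_per_block = blockDim.x * blockDim.y * blockDim.z;

    int mine = modified[blockIdx.x * threads_per_block + tid];

    // Block-wide OR: non-zero if any thread in the block passed non-zero.
    // Every thread of the block must reach this call.
    int any = __syncthreads_or(mine);

    if (tid == 0) {
        block_flags[blockIdx.x] = any;
    }
}
#endif
```

For example, with a 10x7 block (70 threads), thread (x=3, y=4) has thread ID 3 + 4*10 = 43, so it sits in warp 1 at lane 11. That block has three warps, and the last one holds only threads 64-69.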