I know that 32 threads are grouped into a warp, and that a half-warp (16 threads) is executed in SIMD fashion.
Question 1: Are the threads of a single warp assigned in a fixed pattern, for example threads 0 to 31 in warp 1, threads 32 to 63 in warp 2, and so on?
If yes, then Question 2: if I have defined a 2D block, i.e. I have both threadIdx.x and threadIdx.y, how do I calculate the 1D thread number for each thread? Is it threadIdx.x * blocksize.y + threadIdx.y, or threadIdx.y * blocksize.x + threadIdx.x?
I ask because I am storing uchar (1-byte) data in shared memory and accessing it according to the thread IDs. To reduce bank conflicts, I want to write the kernel so that each group of 16 threads (a half-warp) has no bank conflicts, so I need to know which threads are grouped together into warps.
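
To make Question 2 concrete, here is a minimal host-side C++ sketch of the mapping I suspect applies. It assumes the x-fastest ordering that, as far as I understand, the CUDA Programming Guide describes for a 2D block of size (Dx, Dy), i.e. thread ID = x + y * Dx, and it assumes warps are formed from consecutive thread IDs. The Dim2 struct and the function names are my own illustrative stand-ins, not CUDA APIs; inside a kernel the same arithmetic would use threadIdx and blockDim.

```cpp
// Hypothetical stand-in for the x/y parts of CUDA's threadIdx / blockDim.
struct Dim2 { unsigned x, y; };

constexpr unsigned kWarpSize     = 32;  // threads per warp, as stated above
constexpr unsigned kHalfWarpSize = 16;  // threads per half-warp

// Linear thread ID within a 2D block, ASSUMING x varies fastest:
// id = y * blockDim.x + x (the second formula in Question 2).
constexpr unsigned linearThreadId(Dim2 idx, Dim2 dim) {
    return idx.y * dim.x + idx.x;
}

// Warp and half-warp a linear thread ID falls into, ASSUMING
// consecutive IDs are grouped together, starting from ID 0.
constexpr unsigned warpIndex(unsigned tid)     { return tid / kWarpSize; }
constexpr unsigned halfWarpIndex(unsigned tid) { return tid / kHalfWarpSize; }
```

Under these assumptions, in a 16x16 block the thread with (x, y) = (0, 1) gets linear ID 16, so it would sit in warp 0 but in the second half-warp (index 1), and each row of 16 threads would form exactly one half-warp.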