I have a question regarding how to compare 2 buffers avoiding serialization.
I have a two-dimensional integer buffer of approximately 30000000x5 elements.
For example: int Buf1[3000000][5] (never mind how it is allocated; this is just to clarify).
The second buffer is just a pointer to 5 integers, as in: int Buf2[5]
I have to be able to compare the 5 integers in Buf2 with the complete buffer in Buf1.
My problem is the following: if I create a kernel with two buffers, I know that I will have no trouble reading each one of the sub-buffers in Buf1 with each thread.
1 - But what will happen with the second buffer?
2 - Will every thread read from the same address?
3 - Is this going to produce a serialization between all threads?
4 - If this happens, do you have any idea how to avoid it?
Absolutely no problem having multiple threads in a warp reading from the same address.
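As a minimal sketch of that direct approach (the kernel name compareRows, the match output array and the numRows parameter are assumptions for illustration, not something from the original post), each thread can compare one row of Buf1 against Buf2 read straight from global memory:

```cuda
#include <cuda_runtime.h>

#define ROW_LEN 5  // each row of Buf1, and the whole of Buf2, holds 5 ints

// Hypothetical kernel: thread i compares row i of buf1 with buf2.
// match[i] is set to 1 when all 5 integers are equal, otherwise 0.
__global__ void compareRows(const int *buf1, const int *buf2,
                            int *match, int numRows)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= numRows) return;

    int equal = 1;
    for (int k = 0; k < ROW_LEN; ++k) {
        // In each iteration every thread of the warp reads the same buf2[k].
        if (buf1[(size_t)row * ROW_LEN + k] != buf2[k]) equal = 0;
    }
    match[row] = equal;
}
```

It would be launched with one thread per row, for example compareRows<<<(numRows + 255) / 256, 256>>>(dBuf1, dBuf2, dMatch, numRows), where the d-prefixed pointers are device allocations you have already filled.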
Even better: have thread zero read Buf2 into shared memory, then have all threads read it from there; this will be a ‘broadcast’. This should be especially useful if your block size is more than one warp.
Yes
NB: if your block is 2D then also test threadIdx.y == 0
Mmm, sorry: if Buf2 has 5 elements I would normally use the first 5 threads to each copy one element to the shared array, e.g.
if ( threadIdx.x < 5 )
{
    sharedBuf2[ threadIdx.x ] = globalBuf2[ threadIdx.x ];
}
__syncthreads();
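To show where that snippet would sit, here is a rough sketch of a complete kernel plus a small host helper. The names compareRowsShared, compareOnDevice, match and numRows are made up for illustration, and the sketch assumes a 1D block of at least 5 threads, with error checking left out for brevity:

```cuda
#include <cuda_runtime.h>

#define ROW_LEN 5  // each row of Buf1, and the whole of Buf2, holds 5 ints

// Hypothetical kernel: stage Buf2 in shared memory, then compare one row per thread.
__global__ void compareRowsShared(const int *globalBuf1, const int *globalBuf2,
                                  int *match, int numRows)
{
    __shared__ int sharedBuf2[ROW_LEN];

    // The first 5 threads of the block each copy one element of Buf2.
    if (threadIdx.x < ROW_LEN)
        sharedBuf2[threadIdx.x] = globalBuf2[threadIdx.x];
    __syncthreads();  // placed before the early return so every thread reaches it

    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= numRows) return;

    int equal = 1;
    for (int k = 0; k < ROW_LEN; ++k) {
        // All threads read the same shared address: a broadcast, no bank conflict.
        if (globalBuf1[(size_t)row * ROW_LEN + k] != sharedBuf2[k]) equal = 0;
    }
    match[row] = equal;  // 1 if the row equals Buf2, otherwise 0
}

// Hypothetical host helper: copy inputs to the device, launch, copy flags back.
void compareOnDevice(const int *hBuf1, const int *hBuf2, int *hMatch, int numRows)
{
    size_t buf1Bytes = (size_t)numRows * ROW_LEN * sizeof(int);
    int *dBuf1, *dBuf2, *dMatch;
    cudaMalloc(&dBuf1, buf1Bytes);
    cudaMalloc(&dBuf2, ROW_LEN * sizeof(int));
    cudaMalloc(&dMatch, numRows * sizeof(int));
    cudaMemcpy(dBuf1, hBuf1, buf1Bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dBuf2, hBuf2, ROW_LEN * sizeof(int), cudaMemcpyHostToDevice);

    int threads = 256;  // more than one warp per block, and at least 5 threads
    int blocks = (numRows + threads - 1) / threads;
    compareRowsShared<<<blocks, threads>>>(dBuf1, dBuf2, dMatch, numRows);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hMatch, dMatch, numRows * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dBuf1);
    cudaFree(dBuf2);
    cudaFree(dMatch);
}
```

For a 2D block you would add the threadIdx.y == 0 test to the copy condition and compute the row index from both thread coordinates.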