Hello, I am wondering whether the memory accesses in my code are coalesced.
But first, something to clarify.
If I have
myshared[threadIdx.x] = myglobal[threadIdx.x]
then OK, it is coalesced.
If I have
myshared[threadIdx.x] = myglobal[threadIdx.x + some_constant_value]
then I think it is coalesced.
If I have
myshared[threadIdx.x + some_constant_value] = myglobal[threadIdx.x ]
then I think it is coalesced as well.
Now, if I have:
ind = ( threadIdx.y + blockIdx.y * blockDim.y ) * NumberOfCols + ( threadIdx.x + blockIdx.x * blockDim.x )
myshared[threadIdx.x] = myglobal[ind]
I think it is OK. But what about this:
myshared[threadIdx.x + 1] = myglobal[ind]
Is this one still coalesced?
And finally, suppose I have my global data and I want to access different values.
For example, I want to load a pixel at the centre of a grid:
myshared[threadIdx.x + 1] = myglobal[ind]
Now I want to load the value to its left, so:
Left_ind = ind - 1
myshared[threadIdx.x] = myglobal[Left_ind]
And the right value:
Right_ind = ind + 1
myshared[threadIdx.x + 2] = myglobal[Right_ind]
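To make the question concrete, here is a minimal sketch of how I imagine this centre/left/right load as a full kernel. It rests on my own assumptions: TILE, rowSum3, checkRowSum3, out and NumberOfRows are hypothetical names, the block is TILE x 1 threads, and pixels outside the row count as 0. To avoid several threads writing the same shared slot, only the edge threads load the left and right values here, which is a change from the three loads per thread I wrote above.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

constexpr int TILE = 32;  // hypothetical block width; the kernel assumes blockDim = (TILE, 1)

// Loads one row segment plus a one-pixel halo into shared memory,
// then writes left + centre + right for every pixel.
__global__ void rowSum3(const int* myglobal, int* out, int NumberOfCols, int NumberOfRows)
{
    __shared__ int myshared[TILE + 2];  // TILE centre pixels + one halo slot per side

    int col = threadIdx.x + blockIdx.x * blockDim.x;
    int row = threadIdx.y + blockIdx.y * blockDim.y;
    bool inside = (col < NumberOfCols) && (row < NumberOfRows);
    int ind = row * NumberOfCols + col;

    if (inside) {
        // Centre: consecutive threads read consecutive global addresses.
        myshared[threadIdx.x + 1] = myglobal[ind];
        // Left halo: only the first thread of the block loads it.
        if (threadIdx.x == 0)
            myshared[0] = (col > 0) ? myglobal[ind - 1] : 0;
        // Right halo: the last thread of the block, or the thread on the last column.
        if (threadIdx.x == blockDim.x - 1 || col == NumberOfCols - 1)
            myshared[threadIdx.x + 2] = (col + 1 < NumberOfCols) ? myglobal[ind + 1] : 0;
    }
    __syncthreads();  // every thread of the block reaches this barrier

    if (inside)
        out[ind] = myshared[threadIdx.x] + myshared[threadIdx.x + 1] + myshared[threadIdx.x + 2];
}

// Hypothetical host-side check: returns the number of pixels that differ
// from a CPU reference, or -1 if a CUDA call fails.
int checkRowSum3(int NumberOfCols, int NumberOfRows)
{
    size_t n = size_t(NumberOfCols) * NumberOfRows;
    std::vector<int> h_in(n), h_out(n);
    for (size_t i = 0; i < n; ++i) h_in[i] = int(i % 97);

    int *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) { cudaFree(d_in); return -1; }
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    dim3 block(TILE, 1);
    dim3 grid((NumberOfCols + TILE - 1) / TILE, NumberOfRows);
    rowSum3<<<grid, block>>>(d_in, d_out, NumberOfCols, NumberOfRows);
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int r = 0; r < NumberOfRows; ++r)
        for (int c = 0; c < NumberOfCols; ++c) {
            size_t i = size_t(r) * NumberOfCols + c;
            int expected = h_in[i] + (c > 0 ? h_in[i - 1] : 0)
                         + (c + 1 < NumberOfCols ? h_in[i + 1] : 0);
            if (h_out[i] != expected) ++mismatches;
        }
    return mismatches;
}
```

Calling checkRowSum3(100, 8) from main would exercise a row length that is not a multiple of TILE, so the boundary guards are also covered.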
Lastly, there is one thing I can't understand. For example, here:
myshared[threadIdx.x + 1] = myglobal[ind]
all threads are running concurrently, so myshared is going to have:
myshared[0 +1]
myshared[1 +1]
myshared[2 +1]
...
and myglobal will have, all at the same time:
myglobal[0]
myglobal[1]
myglobal[2]
....
How can I be sure that myshared[0 + 1] will receive myglobal[0] and not, for example, myglobal[4]?
Or the other way around, for myshared.
Am I missing something here?
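One way I could test this pairing myself is a small kernel like the sketch below. It is only a sketch under my own assumptions: N, shiftedCopy, countWrongPairs and out are hypothetical names, and a single block of N threads is launched. Each thread does the shifted copy, then reads its shared slot back so the host can count how many slots did not receive the expected global element. If every thread uses its own threadIdx.x on both sides of the assignment, the count should be 0.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int N = 64;  // hypothetical block size (two warps)

__global__ void shiftedCopy(const int* myglobal, int* out)
{
    __shared__ int myshared[N + 1];
    // The same thread evaluates both index expressions with its own threadIdx.x.
    myshared[threadIdx.x + 1] = myglobal[threadIdx.x];
    __syncthreads();
    // Read the slot back so the host can inspect what it received.
    out[threadIdx.x] = myshared[threadIdx.x + 1];
}

// Hypothetical host-side check: returns how many slots t + 1 did not
// receive myglobal[t], or -1 if a CUDA call fails.
int countWrongPairs()
{
    std::vector<int> h_in(N), h_out(N);
    for (int i = 0; i < N; ++i) h_in[i] = 1000 + i;  // distinct values per element

    int *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, N * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, N * sizeof(int)) != cudaSuccess) { cudaFree(d_in); return -1; }
    cudaMemcpy(d_in, h_in.data(), N * sizeof(int), cudaMemcpyHostToDevice);

    shiftedCopy<<<1, N>>>(d_in, d_out);
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, N * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int wrong = 0;
    for (int t = 0; t < N; ++t)
        if (h_out[t] != h_in[t]) ++wrong;
    return wrong;
}
```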
Thank you!