Instructions in a half warp

Hello,

I have a small question:

In a kernel, I have:

T[idx1] = T[idx2]
where T is an array in shared memory, and idx1 and idx2 fall into the same half warp (but not in the same warp).
(idx1 and idx2 may or may not be equal, and there may be bank conflicts.)
Is this safe, given that the source and destination arrays are the same?

That statement doesn’t make any sense. If the threads idx1 and idx2 are in the same half-warp, then they are in the same warp. If they are not in the same warp, they cannot be in the same half-warp.

Sorry, I was totally confused when I wrote this. I just meant that T[idx1] and T[idx2] fall in the same half warp (and I tried to emphasize that they don't fall in different half warps of a given warp).

EDIT: and I realize that “fall in the same half warp” doesn't make any sense either.

The thing to understand is that

T[idx1] = T[idx2]

is an instruction executed by all threads of the grid, with:

T[x] is written by exactly one thread in the grid for every possible x (i.e. the value of idx1 is unique across the whole grid)

T[y] is read by 0, 1, or more threads, all within a single half warp (i.e. the same value of idx2 may occur in several threads of a half warp)

If T is in shared memory, you only need to consider block-level mechanics, because shared memory is allocated per block and is visible only within that block. Even inside a single warp or half warp, there is no guarantee that the operation is safe from read-after-write hazards.
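
One conservative way to avoid relying on any warp-level ordering is to split the statement into a read and a write, with a block-wide barrier in between. The following is only a sketch under assumptions: the kernel name `permute_in_place`, the block size `N`, and the `src_index` array are hypothetical, and it assumes a single block in which idx1 is unique per thread while idx2 may repeat.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 64  // hypothetical block size, chosen only for illustration

// Each thread writes a unique slot (idx1) and reads a possibly shared slot (idx2).
__global__ void permute_in_place(const int *in, const int *src_index, int *out)
{
    __shared__ int T[N];
    int tid = threadIdx.x;

    T[tid] = in[tid];
    __syncthreads();            // T is fully initialized before anyone reads it

    int idx1 = tid;             // assumption: unique per thread
    int idx2 = src_index[tid];  // assumption: may be the same in several threads

    int value = T[idx2];        // 1. read the old value into a register
    __syncthreads();            // 2. wait until every thread in the block has read
    T[idx1] = value;            // 3. only now overwrite the shared array
    __syncthreads();            // make the writes visible to the whole block

    out[tid] = T[tid];
}

int main()
{
    int h_in[N], h_idx[N], h_out[N];
    for (int i = 0; i < N; ++i) {
        h_in[i]  = 100 + i;
        h_idx[i] = (i + 1) % N;  // each thread reads its neighbour's slot
    }

    int *d_in, *d_idx, *d_out;
    cudaMalloc((void **)&d_in,  N * sizeof(int));
    cudaMalloc((void **)&d_idx, N * sizeof(int));
    cudaMalloc((void **)&d_out, N * sizeof(int));
    cudaMemcpy(d_in,  h_in,  N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_idx, h_idx, N * sizeof(int), cudaMemcpyHostToDevice);

    permute_in_place<<<1, N>>>(d_in, d_idx, d_out);
    cudaMemcpy(h_out, d_out, N * sizeof(int), cudaMemcpyDeviceToHost);

    // Every output slot should hold the *old* value of the slot it read from.
    int mismatches = 0;
    for (int i = 0; i < N; ++i)
        if (h_out[i] != h_in[h_idx[i]]) ++mismatches;
    printf("%d mismatches\n", mismatches);

    cudaFree(d_in);
    cudaFree(d_idx);
    cudaFree(d_out);
    return mismatches != 0;
}
```

The extra register and barrier cost a little, but the result no longer depends on how the hardware schedules reads and writes within a warp or half warp.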

Hmm, thank you… but now I'm confused. I think I again forgot a crucial piece of information:

The half warp that handles a T[idx2] read will also handle the only T[idx1] write where idx1 = idx2.

I'm really sorry for the confusion…
