Synchronizing specific threads in cooperative groups


Is it possible to synchronize only some specific threads inside a grid (or multi-grid) cooperative group? For instance, I would like only the threads with ranks A, B and C to call sync(), while letting the others through. Does CUDA 10 support that?

The grid-wide sync mechanism only supports synchronization of all threads, not some subset. Therefore this is not possible if the threads are in separate threadblocks.
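For reference, here is a minimal sketch of what the grid-wide sync looks like, to show that every thread in the grid has to reach the barrier. The kernel name twoPhase, the buffer names and the launch size (2 blocks of 128 threads) are just assumptions for illustration. It also assumes a device that supports cooperative launch and a grid small enough to be fully co-resident. On CUDA 10 this typically has to be built with relocatable device code (-rdc=true) for an sm_60 or newer target.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <cstdio>

namespace cg = cooperative_groups;

// Hypothetical two-phase kernel: phase 2 reads what another thread
// (possibly in another block) wrote in phase 1.
__global__ void twoPhase(int *in, int *out, int n)
{
    cg::grid_group grid = cg::this_grid();
    int i = static_cast<int>(grid.thread_rank());

    if (i < n) in[i] = i;                 // phase 1: each thread writes its slot

    // Every thread in the grid must call this, including threads with i >= n.
    // There is no overload that takes a subset of ranks.
    grid.sync();

    if (i < n) out[i] = in[(i + 1) % n];  // phase 2: read a neighbour's slot
}

int main()
{
    int dev = 0, supported = 0;
    cudaDeviceGetAttribute(&supported, cudaDevAttrCooperativeLaunch, dev);
    if (!supported) {
        printf("cooperative launch is not supported on this device\n");
        return 0;
    }

    const int blocks = 2, threads = 128;  // assumed small enough to be co-resident
    int n = blocks * threads;
    int *in = nullptr, *out = nullptr;
    cudaMalloc(&in, n * sizeof(int));
    cudaMalloc(&out, n * sizeof(int));

    // grid.sync() is only valid when the kernel is launched cooperatively.
    void *args[] = { &in, &out, &n };
    cudaError_t err = cudaLaunchCooperativeKernel((void *)twoPhase, dim3(blocks),
                                                  dim3(threads), args, 0, 0);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

If some threads skipped the grid.sync() call, the rest would wait at the barrier, which is why letting the others through does not work at grid scope.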

If the threads you want to synchronize are in the same threadblock, you can do that selectively, to some degree, by organizing various groups (basically tiles) within that block using cooperative groups, and then synchronizing that group/tile.
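A rough sketch of the tile approach is below. The kernel name tileSync, the 256-thread 1D block and the choice of tile 0 are assumptions made for illustration. The tile index is computed from block.thread_rank() so the code does not depend on helpers added after CUDA 10.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <cstdio>

namespace cg = cooperative_groups;

// Hypothetical kernel, assumed to run with a 1D block of 256 threads.
__global__ void tileSync(int *out)
{
    cg::thread_block block = cg::this_thread_block();
    // Split the block into fixed tiles of 32 consecutive threads.
    cg::thread_block_tile<32> tile = cg::tiled_partition<32>(block);

    __shared__ int scratch[256];
    scratch[block.thread_rank()] = threadIdx.x;

    // Only the first tile (threads 0..31) synchronizes; every other
    // thread in the block passes straight through.
    if (block.thread_rank() / 32 == 0) {
        tile.sync();  // barrier over these 32 threads only
        // Safe to read a value written by another member of the same tile.
        out[tile.thread_rank()] = scratch[(tile.thread_rank() + 1) % 32];
    }
}

int main()
{
    int *out = nullptr;
    cudaMalloc(&out, 32 * sizeof(int));

    tileSync<<<1, 256>>>(out);
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaFree(out);
    return 0;
}
```

Note that the tile membership is fixed by the partitioning (consecutive thread ranks, power-of-two tile size of at most 32 in CUDA 10), so you pick which tile syncs, not which individual threads make up the tile.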

This still doesn’t allow arbitrary synchronization of threads whose threadIdx.x values are, say, 12 and 472. Those two threads sit in different warps, and there is no way to create an intra-block group that consists of just those 2 threads.