Hi everyone,
I'm wondering if there's any significant difference between `mbarrier` and `barrier` in PTX. They seem interchangeable to me; are they actually mixable?
Also, according to the PTX documentation, `mbarrier` and `bulk_group` appear to be distinct concepts. However, I noticed that using `bulk_group` followed by an `arrive` on an `mbarrier` still works as expected. Does this mean these two can also be used together without issues?
Looking forward to your insights!
They are certainly different and not mixable. `mbarrier` stores its state in shared memory, while `barrier` uses dedicated "barrier resources", which are much more limited in number (16 per CTA) but may be faster to access. There are also differences in how they can be used. For example, a thread can arrive at an `mbarrier`, do some other work, and wait on it later. With `barrier`, a thread either arrives without waiting (`barrier.arrive`) or arrives and waits in a single step (`barrier.sync`), so it can't overlap its own waiting time with other work. `barrier` also seems to operate at warp granularity rather than per thread. There are many more differences if you read through the docs.
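
To make the arrive-now, wait-later pattern concrete, here is a minimal CUDA sketch. It uses libcu++'s `cuda::barrier` rather than raw PTX; as far as I know this is lowered to `mbarrier` instructions on sm_80 and newer, but treat that mapping as an assumption and check the generated PTX for your toolchain. The kernel name, buffer size, and the "independent work" are made up for illustration.

```cuda
#include <cuda/barrier>
#include <cooperative_groups.h>
#include <cstdio>
#include <utility>
#include <vector>

namespace cg = cooperative_groups;

constexpr int kThreads = 128;  // illustrative block size, one block only

// Each thread stages a value, arrives at the barrier, does unrelated work,
// and only then waits before reading its neighbour's staged value.
__global__ void split_arrive_wait(const int* in, int* out) {
    __shared__ int staged[kThreads];
    __shared__ cuda::barrier<cuda::thread_scope_block> bar;  // state lives in shared memory

    auto block = cg::this_thread_block();
    if (block.thread_rank() == 0) {
        init(&bar, block.size());  // expected arrivals: every thread in the block
    }
    block.sync();  // combined arrive+wait; makes the initialised barrier visible

    const unsigned t = block.thread_rank();
    staged[t] = in[t] * 2;

    auto token = bar.arrive();  // signal arrival without blocking

    // Work that does not depend on other threads overlaps with the wait.
    const int independent = static_cast<int>(t * t);

    bar.wait(std::move(token));  // block until all threads have arrived

    out[t] = staged[(t + 1) % kThreads] + independent;
}

// Runs the kernel once and returns the number of wrong output elements.
int count_mismatches() {
    std::vector<int> h_in(kThreads), h_out(kThreads, -1);
    for (int i = 0; i < kThreads; ++i) h_in[i] = i;

    int* d_in = nullptr;
    int* d_out = nullptr;
    cudaMalloc(&d_in, kThreads * sizeof(int));
    cudaMalloc(&d_out, kThreads * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), kThreads * sizeof(int), cudaMemcpyHostToDevice);

    split_arrive_wait<<<1, kThreads>>>(d_in, d_out);

    cudaMemcpy(h_out.data(), d_out, kThreads * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    int mismatches = 0;
    for (int t = 0; t < kThreads; ++t) {
        const int expected = 2 * h_in[(t + 1) % kThreads] + t * t;
        if (h_out[t] != expected) ++mismatches;
    }
    return mismatches;
}

int main() {
    std::printf("mismatches: %d\n", count_mismatches());
    return 0;
}
```

The `block.sync()` call is the contrast case: arrival and waiting happen in one step, so nothing can be slotted in between. With `bar.arrive()` and `bar.wait()` split apart, the independent computation runs while slower threads are still on their way to the barrier.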