I have a question about the usage of `bar.sync`. Why does `bar.arrive a, b` need to specify the number of threads to arrive (`b`)? Only `sync` involves waiting; `arrive` does not wait, and each `bar.arrive` can only submit one arrival at a time (`b` is not the number of arrivals submitted per execution). Therefore, I think the number `b` in `bar.arrive a, b` is redundant. (The asker here has the same doubt as I do: StackOverflow: What does thread_count mean for bar.arrive PTX barrier synchronization instruction?)
Probably the barrier is reset once enough threads have arrived, even if they did not sync?
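To make this concrete, here is a minimal producer/consumer sketch, modeled on the pattern shown in the PTX ISA documentation. As far as I understand it, a barrier completes and is reinitialized once the expected number of threads has arrived, and every `bar.sync` / `bar.arrive` on the same barrier is expected to name the same count. My reading (an assumption, not something the docs spell out this way) is that whichever instruction reaches the barrier first has to say how many arrivals complete it, and that could just as well be an `arrive`. The kernel name `producer_consumer`, the buffer `slot`, the helper `run_demo`, and the choice of barrier 1 are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Launch with exactly 64 threads: warp 0 produces, warp 1 consumes.
__global__ void producer_consumer(int *out) {
    __shared__ int slot[32];
    const int lane = threadIdx.x % 32;
    const int warp = threadIdx.x / 32;

    if (warp == 0) {
        slot[lane] = 2 * lane;  // produce a value in shared memory
        // Signal arrival on barrier 1 without waiting. The 64 is the total
        // number of threads expected at this barrier (32 arriving + 32
        // syncing), not the number of arrivals this instruction submits.
        asm volatile("bar.arrive 1, 64;" ::: "memory");
        // The producer warp is free to continue with other work here.
    } else {
        // Wait on barrier 1 until all 64 expected threads have arrived.
        asm volatile("bar.sync 1, 64;" ::: "memory");
        out[lane] = slot[lane];  // consume the produced value
    }
}

// Hypothetical host-side helper: runs the kernel and checks the result.
bool run_demo() {
    int *d_out = nullptr;
    int h_out[32] = {0};
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return false;

    producer_consumer<<<1, 64>>>(d_out);
    bool ok = cudaMemcpy(h_out, d_out, sizeof(h_out),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_out);

    for (int i = 0; ok && i < 32; ++i) ok = (h_out[i] == 2 * i);
    return ok;
}

int main() {
    const bool ok = run_demo();
    std::puts(ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

Barrier 1 is used rather than 0 because `__syncthreads()` uses barrier 0. The count has to be a multiple of the warp size, and both instructions pass the same value.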
O…K? Well, frankly speaking, this does not seem very necessary. But anyway, it is not a big problem.
Thanks!