Understanding bar.sync and the Role of thread_count in bar.arrive

I have a question about the usage of bar.sync and bar.arrive. Why does bar.arrive a, b need to specify the number of arriving threads (b)? Only bar.sync involves waiting; bar.arrive does not wait, and each execution of bar.arrive registers a single arrival (b is not the number of arrivals submitted per execution). So the operand b in bar.arrive a, b looks redundant to me. The asker of this StackOverflow question has the same doubt: "What does thread_count mean for bar.arrive PTX barrier synchronization instruction?"

Probably the barrier is reset once enough threads have arrived, even if some of them arrived without syncing?
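
To make that concrete, here is a minimal producer/consumer sketch using inline PTX. It assumes a launch with exactly 64 threads per block (two warps) and uses barrier 1; the kernel name, the host helper and the buffer names are made up for illustration. The producer warp only arrives, the consumer warp syncs, and both pass the same count of 64. As far as I understand it, that count is how the barrier knows when it is complete, and a bar.arrive may well be the first instruction to reach the barrier, so it has to carry the count too.

```cuda
#include <cuda_runtime.h>

// Hypothetical demo kernel: launch with exactly 64 threads (2 warps) per block.
__global__ void producer_consumer(int *out) {
    __shared__ int slot[32];
    const int lane = threadIdx.x % 32;
    const int warp = threadIdx.x / 32;

    if (warp == 0) {
        // Producer warp: publish a value, then signal arrival without waiting.
        slot[lane] = 2 * lane;
        // 64 = total participants (32 arriving + 32 syncing), not "arrivals per call".
        asm volatile("bar.arrive 1, 64;" ::: "memory");
    } else {
        // Consumer warp: block until all 64 participants have reached barrier 1.
        asm volatile("bar.sync 1, 64;" ::: "memory");
        out[lane] = slot[lane];
    }
}

// Hypothetical host helper: runs the kernel once and checks the consumer's output.
bool run_producer_consumer_demo() {
    int *d_out = nullptr;
    int h_out[32] = {0};
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return false;

    producer_consumer<<<1, 64>>>(d_out);

    // cudaMemcpy synchronizes with the kernel and also surfaces launch errors.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < 32; ++i) {
        if (h_out[i] != 2 * i) return false;
    }
    return true;
}
```

Calling run_producer_consumer_demo() from main should return true on a working device. Note that the count has to be a multiple of the warp size, and every instruction that targets the same barrier has to pass the same count.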

O…K? Well, frankly speaking, this does not seem very necessary. But anyway, it is not a big problem.

Thanks!