__syncthreads() in divergent branches by giving an argument to bar.sync in PTX

cbuchner1 · July 20, 2008, 8:34pm

We have been told that using __syncthreads() in divergent code is dangerous because not every thread reaches the corresponding barrier sync point, potentially leading to deadlocks.

The PTX ISA manual V1.2 states that the bar.sync instruction has an argument, specifying some sort of identifier of the barrier synchronization.

Can I use this to achieve reliable barrier synchronization in if/else statements?
Assuming I have control over the ID passed to the bar.sync command in the PTX, would the following pseudo-code work?

Sync to Barrier #0

if ( some condition )
{
    Do something
    Sync to Barrier #1
    Do some more
}
else
{
    Do something else
    Sync to Barrier #1
    Do more of this
}

Sync to Barrier #2

My point is that I would use the same ID in both divergent code branches, allowing all threads to eventually sync on the same barrier ID - no matter which branch they are in. If this works in principle, couldn’t the compiler try to emit matching IDs on its own?

Christian

gatoatigrado · July 21, 2008, 1:53am

Interesting, though unless you’re incredibly concerned about efficiency, you could just cache the result of the if condition in some shared variable, and repeat the if/else branch after the intermediate __syncthreads(). I think this is easier to read as well.
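
A minimal CUDA sketch of that suggestion, under a few assumptions that are not from the thread: the condition is per-thread, so it is cached in a register rather than in shared memory (a block-wide flag written by one thread would instead go in a `__shared__` variable followed by its own barrier). The kernel name `split_branch`, the block size `BLOCK`, the arithmetic in each branch, and the host helper `run_split_branch_demo` are all hypothetical placeholders. The only point is that the intermediate `__syncthreads()` sits outside the if/else, so every thread in the block reaches it.

```cuda
#include <cuda_runtime.h>

constexpr int BLOCK = 64;  // hypothetical block size for this sketch

// Each branch is split in two; the barrier between the halves is in
// uniform code, so all threads of the block execute it.
__global__ void split_branch(const int *in, int *out)
{
    __shared__ int stage[BLOCK];
    const int tid = threadIdx.x;

    // Evaluate the condition once and cache it (assumed per-thread here).
    const bool cond = (in[tid] % 2 == 0);

    // First half: "Do something" / "Do something else".
    if (cond) stage[tid] = in[tid] * 2;
    else      stage[tid] = in[tid] + 1;

    __syncthreads();  // the intermediate barrier, reached by every thread

    // Second half: repeat the branch using the cached condition.
    const int neighbour = stage[(tid + 1) % BLOCK];
    if (cond) out[tid] = stage[tid] + neighbour;
    else      out[tid] = stage[tid] - neighbour;
}

// Hypothetical host helper: runs one block and compares against a CPU
// reference. Returns the number of mismatches, or -1 on a CUDA error.
int run_split_branch_demo()
{
    int h_in[BLOCK], h_out[BLOCK], stage[BLOCK];
    for (int i = 0; i < BLOCK; ++i) h_in[i] = i;

    int *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) {
        cudaFree(d_in);
        return -1;
    }

    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    split_branch<<<1, BLOCK>>>(d_in, d_out);
    // This blocking copy also reports errors from the kernel launch.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    // CPU reference with the same two-phase structure.
    for (int i = 0; i < BLOCK; ++i)
        stage[i] = (h_in[i] % 2 == 0) ? h_in[i] * 2 : h_in[i] + 1;

    int mismatches = 0;
    for (int i = 0; i < BLOCK; ++i) {
        const int n = stage[(i + 1) % BLOCK];
        const int expect = (h_in[i] % 2 == 0) ? stage[i] + n : stage[i] - n;
        if (h_out[i] != expect) ++mismatches;
    }
    return mismatches;
}
```

The cost is one extra branch on a cached value per barrier; in exchange, no `__syncthreads()` is ever placed in conditionally executed code.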

Sarnath · July 21, 2008, 12:38pm

We had big discussions and some experiments on this point as well… Finally, we decided that the “argument” cannot be used for a conditional __syncthreads()…

Search the forum if you are interested.
