I have a __global__ function (F) that performs a reduction over an array (A). After this global reduction, I do some other operations that need the result of the reduction.
__global__ void F(type A)
{
    type redresult;
    redresult = globalReduction(A);
    // here all threads need to wait until redresult is ready
    alpha = redresult + …; // some operation that needs the result
}
My question is: how can I make all threads wait for redresult to be ready? Is __threadfence() useful here?
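For context, here is a minimal sketch of one commonly used way to get this kind of grid-wide wait: split F into two kernel launches, since launches issued to the same stream run in order and the second kernel only starts after the first has finished. Everything named here is an assumption for illustration and not taken from the code above: `reduceSum` is a hypothetical stand-in for globalReduction (a plain sum), `useResult` stands in for the elided "redresult + …" step, the element type is assumed to be float, and the block size is assumed to be 256. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for globalReduction: sums A[0..n) into *result.
// Assumes blockDim.x == 256 and that *result is zeroed before the launch.
__global__ void reduceSum(const float *A, int n, float *result)
{
    __shared__ float partial[256];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    partial[tid] = (i < n) ? A[i] : 0.0f;
    __syncthreads();

    // Tree reduction within the block (block-level barrier only).
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) partial[tid] += partial[tid + s];
        __syncthreads();
    }

    // One thread per block adds the block's partial sum to the global total.
    if (tid == 0) atomicAdd(result, partial[0]);
}

// Second kernel: by the time it runs, reduceSum has completed,
// so every thread can safely read *result.
__global__ void useResult(const float *A, int n, const float *result, float *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Placeholder for "alpha = redresult + ..."; the real operation is unknown.
    if (i < n) out[i] = *result + A[i];
}

// Runs both kernels on an array of ones and returns out[0].
float runExample()
{
    const int n = 1024, threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *A, *result, *out;
    cudaMallocManaged(&A, n * sizeof(float));
    cudaMallocManaged(&result, sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));

    for (int i = 0; i < n; ++i) A[i] = 1.0f;
    *result = 0.0f;

    // Same (default) stream: the second launch waits for the first to finish.
    reduceSum<<<blocks, threads>>>(A, n, result);
    useResult<<<blocks, threads>>>(A, n, result, out);
    cudaDeviceSynchronize();   // wait on the host before reading out[]

    float first = out[0];
    cudaFree(A);
    cudaFree(result);
    cudaFree(out);
    return first;
}

int main()
{
    printf("out[0] = %f\n", runExample());
    return 0;
}
```

As far as I understand, __threadfence() only orders the calling thread's own memory writes as seen by other threads; it does not make any thread wait, so on its own it is not a barrier. __syncthreads() is a barrier, but only within one block. If the work has to stay in a single kernel, a grid-wide barrier via cooperative groups (`cooperative_groups::this_grid().sync()`, with the kernel launched through `cudaLaunchCooperativeKernel`) may be another option on hardware that supports it.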