How to resync all threads after a reduction

mcvicuna · June 22, 2011, 2:52pm

Hi,

I’m writing an iterative kernel that has a reduction as its last step.

I’m currently re-queuing the kernel for each iteration, because otherwise the threads get out of sync: some go on to the next iteration while the reduction is still in progress, so those free threads behave incorrectly.

I’m thinking of just reducing the number of threads so that it runs on just one core; that way the threads stay synced and I can iterate on the GPU rather than on the CPU.

Any advice? Does CUDA have this same problem?

Thanks,
MarkV.

laughingrice · June 26, 2011, 8:50am

If you need to sync across the whole ND-range, then your only choice is to sync at the CPU level (queue another kernel). If you only need to sync within a work group, then you can use a barrier instead.
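
To illustrate the work-group case, here is a minimal CPU-side sketch in Python rather than a real kernel. It is only an analogy: each Python thread stands in for one work-item of a single work group, the `scratch` list stands in for local memory, and `threading.Barrier` plays the role that `barrier(CLK_LOCAL_MEM_FENCE)` would play in an OpenCL kernel. The function name `iterate_with_barrier`, the "add the total to every element" update, and the power-of-two group size are all assumptions made for the example, not anything from the original post.

```python
import threading


def iterate_with_barrier(values, iterations):
    """Run `iterations` rounds of (tree reduction, then per-item update).

    Hypothetical illustration: one Python thread per "work-item", all in
    one "work group". The group size is assumed to be a power of two.
    """
    n = len(values)
    data = list(values)        # per-item state carried between iterations
    scratch = [0] * n          # stands in for local memory
    barrier = threading.Barrier(n)
    totals = []                # one reduction result per iteration

    def work_item(lid):
        for _ in range(iterations):
            scratch[lid] = data[lid]
            barrier.wait()     # all inputs are in scratch before reducing

            stride = n // 2
            while stride > 0:
                if lid < stride:
                    scratch[lid] += scratch[lid + stride]
                # Every item waits here, including those that did no add,
                # so nobody runs ahead into the next step.
                barrier.wait()
                stride //= 2

            total = scratch[0]
            if lid == 0:
                totals.append(total)
            # Nobody may overwrite scratch for the next iteration until
            # every item has read the total.
            barrier.wait()

            data[lid] += total  # example update that depends on the reduction

    threads = [threading.Thread(target=work_item, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals, data
```

Calling `iterate_with_barrier([1, 2, 3, 4], 3)` gives the totals `[10, 50, 250]`. The point to carry over to a kernel is that every work-item in the group has to reach every barrier, including the ones that are idle during the later reduction steps, and that this only works while the whole problem fits in one work group. Once the reduction spans several work groups, you are back to queuing another kernel as described above.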