serialize threads


My question may sound a little strange. Is it possible to put one thread to sleep until another one finishes its job? I tried doing something like this:

__device__ int current_thread = 0;

__global__ void kernel()
{
  int thread_index = threadIdx.x + blockDim.x * blockIdx.x;

  if (thread_index == 0) {

    /* do some stuff */

  } else {

    /* here other threads prepare some things for me */

    /* kill some time: busy-wait until it is this thread's turn */
    while (current_thread != thread_index)
      ;

    /* do some stuff */

  }
}

It looks strange, but this while loop just freezes everything. The algorithm does finish its job, but it is very slow. I want to do something semi-parallel (if there is such a word): if it is not a thread's turn to work, it just prepares some things and waits for its turn. Is there any way to do this? Thanks.


You can use __syncthreads() to synchronize the threads within a block. Synchronization between blocks is generally not possible while a kernel is running.
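
A minimal sketch of the within-block case, assuming a single block of 64 threads: every thread prepares its value in parallel, the barrier makes sure all of them are done, and then thread 0 works through the values in index order. The names prepare_then_serial, run_prepare_then_serial and N are made up for illustration, and the "work" is only a placeholder.

```cuda
#include <cuda_runtime.h>

#define N 64  // threads in the single block (an assumption of this sketch)

// Each thread prepares one value in parallel. After the barrier, thread 0
// consumes the prepared values strictly in index order.
__global__ void prepare_then_serial(int *result)
{
    __shared__ int prepared[N];

    int t = threadIdx.x;
    prepared[t] = t * t;            // parallel "prepare" phase (placeholder work)

    __syncthreads();                // all threads of the block have written their value

    if (t == 0) {
        int acc = 0;
        for (int i = 0; i < N; ++i) // serial phase, one element after another
            acc += prepared[i];
        *result = acc;
    }
}

// Hypothetical host helper: launches one block and returns the result.
int run_prepare_then_serial()
{
    int *d_result = nullptr;
    int h_result = 0;

    cudaMalloc(&d_result, sizeof(int));
    prepare_then_serial<<<1, N>>>(d_result);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(&h_result, d_result, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_result);
    return h_result;
}
```

This only works because all 64 threads live in the same block. __syncthreads() does nothing for threads in other blocks.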

It looks like a sequential program. Why do you need to run it in CUDA?

No, it won’t work. To expand on E.D. Riedijk’s answer: the threads in a grid don’t all run at once (only a few blocks at a time are resident on the card). You can’t wait on threads that haven’t even been loaded onto the GPU yet.
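
One common way around this, sketched below under the assumption that the "prepare" part and the "in order" part can be separated, is to split the work into two kernels. Kernels launched into the same stream run one after the other, so the boundary between the two launches acts as a barrier across all blocks. The names prepare, consume_in_order and run_two_phase are hypothetical, and the work inside them is a placeholder.

```cuda
#include <cuda_runtime.h>

// Phase 1: every thread in the grid prepares its own element in parallel.
__global__ void prepare(int *prepared, int n)
{
    int i = threadIdx.x + blockDim.x * blockIdx.x;
    if (i < n)
        prepared[i] = 2 * i;        // placeholder work
}

// Phase 2: a single thread walks the prepared data in index order.
__global__ void consume_in_order(const int *prepared, int n, int *result)
{
    int acc = 0;
    for (int i = 0; i < n; ++i)
        acc += prepared[i];
    *result = acc;
}

// Hypothetical host helper: runs both phases and returns the result.
int run_two_phase(int n)
{
    int *d_prepared = nullptr;
    int *d_result = nullptr;
    int h_result = 0;

    cudaMalloc(&d_prepared, n * sizeof(int));
    cudaMalloc(&d_result, sizeof(int));

    int threads = 128;
    int blocks = (n + threads - 1) / threads;

    // Both launches go to the default stream, so the second kernel
    // starts only after every block of the first one has finished.
    prepare<<<blocks, threads>>>(d_prepared, n);
    consume_in_order<<<1, 1>>>(d_prepared, n, d_result);

    cudaMemcpy(&h_result, d_result, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_prepared);
    cudaFree(d_result);
    return h_result;
}
```

The serial phase on a single GPU thread will be slow, so this only pays off if the preparation is the expensive part.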