Critical section problem using atomicCAS

I want multiple threads to modify a global variable, so I need to create a critical section.

To implement it, I use atomicCAS on g_odata[0] as a lock, so that only one thread can enter the critical section at a time.

A thread enters the critical section when g_odata[0] == 1 and immediately sets g_odata[0] = 0 so that no other thread can enter.

When the thread leaves the critical section, it sets g_odata[0] = 1 so that another thread can enter.

The code below implements this method, but my computer always hangs when I run the program.

Does anybody know what is wrong with my program?

Thank you very much.

[codebox]__global__ void testKernel(int* g_odata)
{
    ////////// critical section start //////////
    // g_odata[0] is the lock; it is initialized to 1 (unlocked)
    while (1)
    {
        // try to acquire the lock: swap 1 -> 0, which succeeds for only one thread
        if (atomicCAS(&g_odata[0], 1, 0) == 1)
            break;
    }

    // modify any global variable or shared variable in the critical section

    g_odata[0] = 1;   // release the lock
    ////////// critical section end //////////
}[/codebox]

I did some research on a similar problem and described the results in this thread.

My hypothesis is that when a warp diverges, one branch executes while the other branch is put “on hold”. When the executing branch reaches the join point, it waits there while the other branch resumes. Once both branches have reached the join point, the warp is no longer divergent and continues executing as a whole. If the thread that holds the lock is in the branch that is on hold (or is already waiting at the join point) while the other branch spins on that lock, the lock is never released and the warp deadlocks.
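For illustration, one commonly used workaround is to keep the acquire, the critical-section work, and the release inside the same branch of the loop, so the thread that holds the lock does not have to wait at a join point before releasing it. The following is only a minimal sketch under that hypothesis, assuming the lock is initialized to 1 as in the first post. The names lockedIncrement, runLockedIncrement, lock, and counter are hypothetical, and whether the pattern avoids the hang can depend on the hardware generation and on how the compiler lays out the branches.

[codebox]
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread increments *counter once under a spin lock.
// *lock == 1 means unlocked, *lock == 0 means locked (same convention as above).
__global__ void lockedIncrement(int* lock, int* counter)
{
    bool done = false;
    while (!done)
    {
        if (atomicCAS(lock, 1, 0) == 1)   // try to acquire the lock
        {
            // critical section; volatile access so the value is not cached in a register
            volatile int* c = counter;
            *c = *c + 1;

            __threadfence();              // make the update visible before releasing
            atomicExch(lock, 1);          // release inside the same branch
            done = true;
        }
    }
}

// Hypothetical host helper: launches one block and returns the final counter.
int runLockedIncrement(int threads)
{
    int *lock, *counter;
    int one = 1, zero = 0, result = 0;
    cudaMalloc(&lock, sizeof(int));
    cudaMalloc(&counter, sizeof(int));
    cudaMemcpy(lock, &one, sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(counter, &zero, sizeof(int), cudaMemcpyHostToDevice);

    lockedIncrement<<<1, threads>>>(lock, counter);
    cudaDeviceSynchronize();

    cudaMemcpy(&result, counter, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(lock);
    cudaFree(counter);
    return result;
}

int main()
{
    // If the lock works, there is one increment per thread.
    printf("counter = %d\n", runLockedIncrement(64));
    return 0;
}
[/codebox]

The difference from the original kernel is that the thread which wins the atomicCAS does its work and releases the lock before it leaves the branch, instead of breaking out of the loop while still holding the lock.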

:thumbup: :thumbup: Thank you very much.

My original critical section was meant to prevent different threads from accessing the same global variable at the same time.

I tried a new way to implement the critical section that only prevents different warps from accessing the same global variable at the same time: only the first thread of each warp enters it.

The code follows; it does not hang.

[codebox]
if ((threadIdx.x % 32) == 0) {
    /////////// critical section start //////////////
    while (1) {
        if (atomicCAS(&g_odata[0], 1, 0) == 1)
            break;
    }

    // modify any global variable or shared variable in the critical section

    g_odata[0] = 1;
    /////////// critical section end ///////////////
}
[/codebox]

I use this critical section to prevent different warps from accessing the same global variable at the same time.

I run the kernel with 1 block of 64 threads (two warps).

Thread 0 is the first thread of the first warp, and thread 32 is the first thread of the second warp.

Threads 0 and 32 enter the critical section to access global_location, and I don’t care which of them enters first.

The first thread to access global_location reads the value 0 (global_location is initialized to 0), saves it to the shared variable location, and increases global_location by 32.

The second thread to access global_location reads the value 32 and saves it to the shared variable location.

Finally, every thread reads the shared variable for its warp_id and saves it to output[0~63].

I expect output[0~31] to all be 0 or all be 32, and output[32~63] to all be the other value.

The following code produces the output I expect, but only if I add the __syncthreads(). I don’t know why I need that instruction. If I remove it, output[0] is 0 or 20 and output[32] is 20 or 0, but the other elements, output[1~31] and output[33~63], still hold the initial value 0.

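[codebox]// global_location starts at 0, lock starts at 1 (unlocked), output has 64 elements
__global__ void Kernel(int* global_location, int* lock, int* output)
{
    __shared__ int location[2];   // one slot per warp
    int warp_id = threadIdx.x / 32;

    if ((threadIdx.x % 32) == 0) {   // only the first thread of each warp
        /////// critical section start ///////////
        while (atomicCAS(lock, 1, 0) == 0);   // spin until the lock is acquired

        location[warp_id] = *global_location;
        *global_location += 32;

        *lock = 1;   // release the lock
        /////// critical section end ///////////
    }

    __syncthreads();   // removing this barrier gives the wrong output described above

    output[threadIdx.x] = location[warp_id];
}[/codebox]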

From the 2.1 programming guide, section 4.2.2.3 “__shared__”:

It appears that you have one thread modifying the shared variable while other threads try to read it. The shared data is only guaranteed to be correct for the other threads after a __syncthreads()…
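A minimal sketch of that write, barrier, read pattern follows. As an assumption made here for brevity (it is not what the kernel above does), the lock is swapped for a single atomicAdd, which returns the old value and adds 32 in one atomic step, so this particular update needs no critical section. The names reserveOffsets and runReserveOffsets are hypothetical.

[codebox]
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: the first thread of each warp reserves an offset,
// publishes it through shared memory, and every thread copies it to output.
__global__ void reserveOffsets(int* global_location, int* output)
{
    __shared__ int location[2];   // one slot per warp (64 threads assumed)
    int warp_id = threadIdx.x / 32;

    if ((threadIdx.x % 32) == 0)
    {
        // atomicAdd returns the previous value and adds 32 atomically
        location[warp_id] = atomicAdd(global_location, 32);
    }

    __syncthreads();   // barrier: the shared writes above are visible to the block after this

    output[threadIdx.x] = location[warp_id];
}

// Hypothetical host helper: runs 1 block of 64 threads and copies the result back.
void runReserveOffsets(int* host_output)
{
    int *global_location, *output;
    cudaMalloc(&global_location, sizeof(int));
    cudaMalloc(&output, 64 * sizeof(int));
    cudaMemset(global_location, 0, sizeof(int));   // global_location starts at 0

    reserveOffsets<<<1, 64>>>(global_location, output);
    cudaMemcpy(host_output, output, 64 * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(global_location);
    cudaFree(output);
}

int main()
{
    int host_output[64];
    runReserveOffsets(host_output);
    printf("output[0] = %d, output[31] = %d, output[32] = %d, output[63] = %d\n",
           host_output[0], host_output[31], host_output[32], host_output[63]);
    return 0;
}
[/codebox]

The barrier has to sit outside the if block, because every thread of the block must reach __syncthreads().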

Thank you very much. :rolleyes: :thumbup: