Race Condition CUDA?

Hello All,

I am new to this forum, so I would like to say ‘hi’ and ask for help with what is likely a simple problem, but one that is giving me some trouble. I am writing some CUDA code that operates on an array, but it can return wrong results depending on the size of the problem. I think it could be a race condition but am unsure.

Can anyone look at the sample code and let me know if there is an obvious problem?

Thank you.

#define N 1024
#define TPB 512

__global__ void func(float4 *a, float *res, int n) {
  int idx = threadIdx.x + blockIdx.x * blockDim.x;
  if (idx < n) {
    for (int j = idx + 1; j < n; ++j) {
      res[idx] += a[idx].w * (a[idx].x - a[j].x);
      res[j]   += a[j].w * (a[j].x - a[idx].x);
    }
  }
}

float4 *A;                     // host input, filled before calling the GPU kernel

float *result = new float[N];  // host output

// dev_A and dev_res are the device copies of A and result
func<<<N/TPB, TPB>>>(dev_A, dev_res, N);

Yes, you have a race condition. Consider the thread for which idx = 0 and the thread for which idx = 32.

Thread 32 can write to res[32] based on the first line of the body of the for-loop.

Thread 0 can write to res[32] based on the second line of the body of the for-loop.

There is no guarantee of the order in which threads execute, and in fact these two threads could be executing at the same time. Both could be trying to update res[32] simultaneously. The result will depend on who “wins” this race, and there is no way to tell or predict who will win.

Hence you may get varying results from run to run.
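To see the lost-update effect in isolation, here is a small self-contained sketch (my own illustrative kernels, not your code) where every thread adds 1.0f to the same location. The plain += is a non-atomic read-modify-write and typically loses most updates; atomicAdd serializes them. This assumes float atomicAdd, which requires compute capability 2.0 or later:

```cuda
#include <cstdio>

// Non-atomic read-modify-write: concurrent threads can read the same
// old value and overwrite each other's additions.
__global__ void racy_add(float *counter) {
  *counter += 1.0f;
}

// Hardware-serialized update: every addition is counted.
__global__ void safe_add(float *counter) {
  atomicAdd(counter, 1.0f);
}

int main() {
  float *d, h;
  cudaMalloc(&d, sizeof(float));

  cudaMemset(d, 0, sizeof(float));
  racy_add<<<32, 256>>>(d);
  cudaMemcpy(&h, d, sizeof(float), cudaMemcpyDeviceToHost);
  printf("racy:   %.0f (8192 expected, usually far less)\n", h);

  cudaMemset(d, 0, sizeof(float));
  safe_add<<<32, 256>>>(d);
  cudaMemcpy(&h, d, sizeof(float), cudaMemcpyDeviceToHost);
  printf("atomic: %.0f\n", h);  // prints 8192

  cudaFree(d);
  return 0;
}
```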

It is a race condition.

That’s what I thought.

Thank you.

P.S. Any suggestion as to how to mitigate this with a simple code change or should I just rethink the algorithm itself?

global atomics to the rescue!
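For the archive, a sketch of the kernel with both updates made atomic. Note that even the res[idx] update needs the atomic, because other threads also write res[idx] through their second update line. This assumes float atomicAdd (compute capability 2.0 or later):

```cuda
__global__ void func(float4 *a, float *res, int n) {
  int idx = threadIdx.x + blockIdx.x * blockDim.x;
  if (idx < n) {
    for (int j = idx + 1; j < n; ++j) {
      // Both res[idx] and res[j] can be written by several threads
      // at once, so both accumulations must be atomic.
      atomicAdd(&res[idx], a[idx].w * (a[idx].x - a[j].x));
      atomicAdd(&res[j],   a[j].w  * (a[j].x  - a[idx].x));
    }
  }
}
```

Atomics serialize contended updates, so this is correct but can be slow under heavy contention; a reduction-style rewrite of the algorithm avoids the contention if performance matters.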

Thanks, global atomics worked.