Hi experts,

I am a beginner in CUDA programming. I am trying to parallelize a serial code that finds the convex envelope (convex hull) of a given set of points in a 2D plane.

The serial code looks like this:

void find_edges_on_host(float * h_x, float * h_y, int *h_edges)
{
    //loop on all possible edges == pairs of points
    for (int i=0; i<NPOINTS; i++)
    {
        for (int j=0; j<NPOINTS; j++)
        {
            if (i>=j)
            {
                continue; //edge i,j == edge j,i
            }

            //all the other points should be on the same side of the edge i,j
            //normal to the edge (unnormalized)
            float nx= - ( h_y[j]- h_y[i]);
            float ny= h_x[j]- h_x[i];

            …

            …

            // This is where the race condition occurs once the loops are parallelized
            if (isValid)
            {
                //write the edge to h_edges in the first available position
                if (h_edges[2*i]==-1)
                {
                    h_edges[2*i]=j;
                }
                else
                {
                    h_edges[2*i+1]=j;
                }

                //we write the edge two times for a direct access in the next stage
                if (h_edges[2*j]==-1)
                {
                    h_edges[2*j]=i;
                }
                else
                {
                    h_edges[2*j+1]=i;
                }
            }
        }
    }
    return;
}

To parallelize this, the two for loops can be replaced by a 2D arrangement of threads, one thread per pair (i, j). My problem lies with the ‘if’ block. For example, if edge 1-2 and edge 1-3 are both valid, then we have 2 threads with the same threadIdx.x = 1 (that is, i = 1) which can both see h_edges[2*i]==-1 and both try to execute the statement

h_edges[2*i]=j;

Similar race conditions are possible for the other writes to h_edges.

My question is: is there any way to remove this race condition without using atomic functions such as ‘atomicAdd’?

Can I restructure the ‘if’ block in such a way that the race condition cannot occur?
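One possible restructuring, given only as a rough sketch and not as a definitive answer: let each point i own its two slots h_edges[2*i] and h_edges[2*i+1], and let a single worker scan every j for that i. Because no two workers ever write the same element, no atomics are needed. The code below is plain C so it can be checked on the host. The names edge_is_valid, find_edges_for_point and find_edges_owner_writes are hypothetical, n stands in for NPOINTS, and the validity test is my own stand-in for the part elided in the post. It also assumes that no three points are collinear, so each hull point has exactly two neighbours.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the elided validity test:
   true if all other points lie on one side of edge (i, j). */
static bool edge_is_valid(const float *x, const float *y, int n, int i, int j)
{
    float nx = -(y[j] - y[i]);   /* normal to the edge (unnormalized) */
    float ny = x[j] - x[i];
    bool has_pos = false, has_neg = false;
    for (int k = 0; k < n; k++) {
        if (k == i || k == j) continue;
        float d = nx * (x[k] - x[i]) + ny * (y[k] - y[i]);
        if (d > 0.0f) has_pos = true;
        if (d < 0.0f) has_neg = true;
    }
    return !(has_pos && has_neg);
}

/* Work for one point i. Only edges[2*i] and edges[2*i+1] are written,
   so running this for different i concurrently cannot collide. */
static void find_edges_for_point(const float *x, const float *y, int n,
                                 int i, int *edges)
{
    int slot = 0;                 /* private counter, replaces the ==-1 test */
    edges[2 * i] = -1;
    edges[2 * i + 1] = -1;
    for (int j = 0; j < n && slot < 2; j++) {
        if (j == i) continue;
        if (edge_is_valid(x, y, n, i, j)) {
            edges[2 * i + slot] = j;
            slot++;
        }
    }
}

/* Serial driver: on the GPU this loop would be replaced by one thread per i. */
void find_edges_owner_writes(const float *x, const float *y, int n, int *edges)
{
    for (int i = 0; i < n; i++)
        find_edges_for_point(x, y, n, i, edges);
}
```

In a kernel, the body of find_edges_for_point would presumably run with i = blockIdx.x * blockDim.x + threadIdx.x, guarded by i < n. The trade-off is that every edge is tested twice (once from each endpoint) and the parallelism drops from one thread per pair to one thread per point, in exchange for writes that never overlap.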

Kind regards