Hi all,

I am trying to write a program that reads from and writes to the same array. For example, my CUDA kernel is shown below.

```
__global__ void floydWarshall_kernel(int* graph, int n)
{
    int k;
    int j = threadIdx.x;
    int i = threadIdx.y;

    if ((i < n) && (j < n) && (i != j))
    {
        for (k = 0; k < n; ++k)
        {
            int ij, ik, kj;
            ij = graph[j*n+i];
            ik = graph[k*n+i];
            kj = graph[j*n+k];
            if ((ik * kj != 0) && (i != j))
                if ((ij >= ik + kj) || (ij == 0))
                    graph[j*n+i] = ik + kj;
        }
    }
}
```

With this kernel I am not getting the expected result. I suspect there is a race condition among threads that read and write the same memory addresses. How can I avoid it? I read that the `__syncthreads()` function can be used to avoid race conditions. Can anyone clarify how and where to use it in the kernel above? Also, that function only synchronizes threads within the same block. How can I avoid race conditions between threads that reside in different blocks?
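For context, here is a minimal sketch of one common way to structure this, not a definitive fix. It moves the `k` loop to the host and launches one kernel per value of `k`. Kernels launched into the same stream run in order, so each launch boundary acts as a barrier across all blocks and `__syncthreads()` is not needed. The names `floydWarshallStep` and `floydWarshall` and the 16x16 block size are hypothetical choices for illustration. The sketch assumes, as the kernel above seems to, that 0 means "no edge" and that the diagonal of the matrix is 0. Under that assumption row `k` and column `k` are never written during step `k`, so threads within one launch do not conflict. Error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// One relaxation step for a fixed intermediate node k.
// Assumption: 0 encodes "no edge" and the diagonal is 0.
__global__ void floydWarshallStep(int* graph, int n, int k)
{
    // Global indices, so the grid may span several blocks.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i >= n || j >= n || i == j) return;

    int ik = graph[k * n + i];
    int kj = graph[j * n + k];
    if (ik == 0 || kj == 0) return;      // no path through k

    int ij = graph[j * n + i];
    if (ij == 0 || ij > ik + kj)
        graph[j * n + i] = ik + kj;
}

// Hypothetical host helper: copies the matrix to the device,
// runs n steps, and copies the result back.
void floydWarshall(std::vector<int>& graph, int n)
{
    int* d_graph = nullptr;
    size_t bytes = graph.size() * sizeof(int);
    cudaMalloc(&d_graph, bytes);
    cudaMemcpy(d_graph, graph.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);

    // Each launch finishes before the next one starts (same stream),
    // which gives the cross-block synchronization between k steps.
    for (int k = 0; k < n; ++k)
        floydWarshallStep<<<grid, block>>>(d_graph, n, k);

    // cudaMemcpy on the default stream waits for the kernels to finish.
    cudaMemcpy(graph.data(), d_graph, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_graph);
}
```

Calling `floydWarshall(hostMatrix, n)` on a row-major `n*n` vector updates it in place. The cost of this design is `n` kernel launches, which is usually acceptable unless `n` is very small.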

Any suggestions would be highly appreciated.

Thanks