Will this cause a thread race?

Here is the small function. The important parameter is “d_convParam”, which is defined as a device variable at the start of the code.

__device__ void convergence(float *Told, float *Tnew)
{
	int index = blockIdx.x * blockDim.x + threadIdx.x;
	if (index < Nx * Nz)
	{
		float a = fabsf(Tnew[index] - Told[index]);
		if (a < 0.00001f)
		{
			d_convParam += 1;
		}
	}
}

Can two or more threads try to update “d_convParam” at the exact same time, causing one or more updates to be missed? My calculation reaches full convergence when d_convParam == Nx*Nz, which can only happen if every single thread successfully executes “d_convParam += 1”. A race condition would prevent that. Do I have one here?

Yes, you can have a race condition here.

There is an atomic add function in CUDA (atomicAdd) that can help. You might also consider rethinking your algorithm so that it does not cause a race condition in the first place. This piece of code looks to me a bit like a kind of histogram calculation, and there is no single best solution for that class of problem.
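A minimal sketch of the atomic fix, assuming d_convParam is a __device__ int counter and Nx, Nz are compile-time constants (names taken from the question; the surrounding declarations are my assumption):

```cuda
// Assumed declarations, matching the names in the question:
#define Nx 128
#define Nz 128
__device__ int d_convParam;   // reset to 0 from the host before each check

__device__ void convergence(const float *Told, const float *Tnew)
{
	int index = blockIdx.x * blockDim.x + threadIdx.x;
	if (index < Nx * Nz)
	{
		float a = fabsf(Tnew[index] - Told[index]);
		if (a < 0.00001f)
		{
			// atomicAdd serializes concurrent increments, so no
			// update is lost even when many threads converge in
			// the same launch.
			atomicAdd(&d_convParam, 1);
		}
	}
}
```

Note that atomics serialize the contending threads, so if most of the Nx*Nz threads converge at once this counter becomes a hotspot; a block-level reduction (or counting non-converged points instead) scales better, which is why rethinking the algorithm is worth considering.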

MK