Strange problem with a simple program

Hello!

I’m trying to run a simple program, but I’m running into a very strange problem. Here is my kernel:

__global__ void test(unsigned int* V, unsigned int* F, unsigned int* data, int size) {
    int b = blockIdx.x;
    int t = threadIdx.x;
    int ct = t + BLOCK_SIZE * b;
    if (ct <= size) {
        int node = data[ct];
        int x = node/BASE;
        V[x] = V[x] | (1<<(node%BASE)); // ***
    }
}

When I run it with size = 3 and all the arrays set up correctly, the line marked *** doesn’t seem to produce the expected result. Here comes the weird thing: I tried the following code instead:

__global__ void test(unsigned int* V, unsigned int* F, unsigned int* data, int size) {
    int b = blockIdx.x;
    int t = threadIdx.x;
    int ct = t + BLOCK_SIZE * b;
    if (ct <= size) {
        int node = data[ct];
        int x = node/BASE;
        if (ct == 0)
            V[x] = V[x] | (1<<(node%BASE));  // ***
    }
}

As you can see, the only change is the added if statement. In this case the *** line works, but only for the thread with ct = 0. I also tried it with ct == 1 and ct == 2, and each time it works only for the respective thread. If I switch to device emulation mode, then everything works as expected. Any ideas?

I found the solution - atomicOr :)
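
For anyone who hits the same thing: as far as I can tell, the problem is that several threads do a read-modify-write on the same word V[x] at the same time, so all but one of the updates get lost. Emulation mode probably hides this because the threads don't really run concurrently there. Below is a minimal sketch of the atomicOr version. Note the assumptions: BLOCK_SIZE = 256 and BASE = 32 are my guesses (the values are not given above), the unused F parameter is dropped, the bounds check uses < on the assumption that size is the number of elements in data, and setBits is a hypothetical host helper added only for illustration.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

#define BLOCK_SIZE 256  // assumed value; not given in the post
#define BASE 32         // assumed: number of bits in one unsigned int word

// Each thread sets one bit in the bitmap V. Several threads may target the
// same word V[x], so the OR has to be a single atomic read-modify-write.
__global__ void test(unsigned int* V, const unsigned int* data, int size) {
    int ct = threadIdx.x + BLOCK_SIZE * blockIdx.x;
    if (ct < size) {  // '<' assumes size is the element count of data
        unsigned int node = data[ct];
        unsigned int x = node / BASE;
        atomicOr(&V[x], 1u << (node % BASE));
    }
}

// Hypothetical host helper: builds a zeroed bitmap of `words` words on the
// device, sets the bit for every entry of `nodes`, and copies it back.
std::vector<unsigned int> setBits(const std::vector<unsigned int>& nodes, int words) {
    std::vector<unsigned int> V(words, 0u);
    if (nodes.empty() || words == 0) return V;

    int size = static_cast<int>(nodes.size());
    unsigned int* dV = nullptr;
    unsigned int* dData = nullptr;
    cudaMalloc(&dV, words * sizeof(unsigned int));
    cudaMalloc(&dData, size * sizeof(unsigned int));
    cudaMemcpy(dV, V.data(), words * sizeof(unsigned int), cudaMemcpyHostToDevice);
    cudaMemcpy(dData, nodes.data(), size * sizeof(unsigned int), cudaMemcpyHostToDevice);

    // Enough blocks of BLOCK_SIZE threads to cover every element.
    int blocks = (size + BLOCK_SIZE - 1) / BLOCK_SIZE;
    test<<<blocks, BLOCK_SIZE>>>(dV, dData, size);

    // This copy waits for the kernel and reports any launch or execution error.
    cudaError_t err = cudaMemcpy(V.data(), dV, words * sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);
    (void)err;

    cudaFree(dV);
    cudaFree(dData);
    return V;
}
```

With this version, calling setBits({0, 1, 2}, 1) should give a single word with the three lowest bits set, whereas the non-atomic kernel can end up with only one of them. atomicOr on global memory needs a device of compute capability 1.1 or higher, so the code has to be compiled for a matching architecture.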