Hello!
I’m trying to run a simple program, but I’ve run into a very strange problem. Here is my kernel:
__global__ void test(unsigned int* V, unsigned int* F, unsigned int* data, int size) {
    int b = blockIdx.x;
    int t = threadIdx.x;
    int ct = t + BLOCK_SIZE * b;
    if (ct <= size) {
        int node = data[ct];
        int x = node/BASE;
        V[x] = V[x] | (1<<(node%BASE)); // *** ///
    }
}
When I run it with size = 3 and all the arrays set up correctly, the line marked *** just doesn’t produce any result. Here is the weird part. I tried the following code:
__global__ void test(unsigned int* V, unsigned int* F, unsigned int* data, int size) {
    int b = blockIdx.x;
    int t = threadIdx.x;
    int ct = t + BLOCK_SIZE * b;
    if (ct <= size) {
        int node = data[ct];
        int x = node/BASE;
        if (ct == 0)
            V[x] = V[x] | (1<<(node%BASE)); // *** ///
    }
}
As you can see, the only change is the added if statement. In this case the *** line works, but only for the thread with ct = 0. I also tried it with ct == 1 and ct == 2, and each time it works only for the respective thread. If I switch to device emulation mode, then everything works as expected. Any ideas?
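One possibility I can't rule out: if several threads land on the same V[x], the read-modify-write on the *** line is not atomic, so all but one of the updates could be lost. Below is a minimal sketch of a variant that uses atomicOr instead. It is only a guess at the cause, not a confirmed fix. BLOCK_SIZE = 256, BASE = 32, the kernel name test_atomic and the input values are assumptions made for illustration (the real values are not shown above), the unused F parameter is dropped, and the bound is written as ct < size so that only data[0..size-1] is read. atomicOr on global memory needs a device of compute capability 1.1 or later.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_SIZE 256  // assumed value; the original definition is not shown
#define BASE 32         // assumed: number of bits in one unsigned int word

// Hypothetical variant of the kernel: each thread sets one bit in V,
// using an atomic OR so concurrent updates to the same word are not lost.
__global__ void test_atomic(unsigned int* V, const unsigned int* data, int size) {
    int ct = threadIdx.x + BLOCK_SIZE * blockIdx.x;
    if (ct < size) {                            // '<' keeps reads inside data[0..size-1]
        unsigned int node = data[ct];
        unsigned int x = node / BASE;           // word index
        atomicOr(&V[x], 1u << (node % BASE));   // atomic read-modify-write
    }
}

int main() {
    const int size = 3;
    unsigned int h_data[size] = {1, 5, 9};      // hypothetical input: all map to V[0]
    unsigned int h_V[1] = {0};

    unsigned int* d_V = 0;
    unsigned int* d_data = 0;
    cudaMalloc(&d_V, sizeof(h_V));
    cudaMalloc(&d_data, sizeof(h_data));
    cudaMemcpy(d_V, h_V, sizeof(h_V), cudaMemcpyHostToDevice);
    cudaMemcpy(d_data, h_data, sizeof(h_data), cudaMemcpyHostToDevice);

    test_atomic<<<1, BLOCK_SIZE>>>(d_V, d_data, size);
    cudaError_t err = cudaGetLastError();       // catches launch errors
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // The blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_V, d_V, sizeof(h_V), cudaMemcpyDeviceToHost);
    printf("V[0] = 0x%08x\n", h_V[0]);          // bits 1, 5 and 9 set: 0x00000222

    cudaFree(d_V);
    cudaFree(d_data);
    return 0;
}
```

With these inputs all three threads write to the same word, which is exactly the case where the plain V[x] = V[x] | ... form can drop updates. That would also fit with the single-thread if version and device emulation mode both behaving correctly.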