Increment a device variable each time a kernel function gets executed.

What I want to achieve is:

  1. A single variable that gets updated from different blocks.

So when I pass an execution configuration like <<<N,1>>>, meaning N blocks with 1 thread each, the variable should get incremented every time a block runs the kernel function.

For example, I have some code like this:

#include<iostream>
#include<stdio.h>
using namespace std;

__global__ void test(int *a,int *b)
{
	int i;
	i=blockIdx.x;
	if(i%2)
		a++;
	else
		b++;
	printf("%d -- %d\n",a,b);
}

int main()
{
	int *da,*db;
	int a=0,b=0;
	cudaMalloc((void**)&da,sizeof(int));
	cudaMalloc((void**)&db,sizeof(int));
	cudaMemcpy(&da,&a,sizeof(int),cudaMemcpyHostToDevice);
	cudaMemcpy(&db,&b,sizeof(int),cudaMemcpyHostToDevice);
	test<<<30,1>>>(da,db);
	cudaThreadSynchronize();
	getchar();
	return 0;
}

So at the end of the above code, shouldn't a and b both contain 15? They are both in global memory, so the change made by each block running the kernel function should be reflected there.

Instead I get random values in a and b each time (sometimes 5 and 7, sometimes 4 and 2, and so on).

Is there a problem with the code? If not, what might be the explanation?

Hi,
What you need is an atomic operation. Between one thread reading the value and writing back the incremented result, another thread may read the same old value, and its write then discards the previous increment.
Look at atomicAdd and atomicInc in the CUDA documentation.
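
To make that concrete, here is a minimal sketch of how the posted program might look with atomics. It uses atomicAdd rather than atomicInc, since atomicInc wraps back to 0 at a limit you pass in. It also assumes two further changes beyond the atomic itself: the kernel updates the value the pointer refers to (a++ on the pointer parameter only moves the block's local copy of the pointer), and cudaMemset/cudaMemcpy are given da and db rather than &da and &db. The helper name countBlocks is made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each block adds 1 to one of two counters that live in global memory.
__global__ void test(int *a, int *b)
{
    if (blockIdx.x % 2)
        atomicAdd(a, 1);   // odd-numbered blocks
    else
        atomicAdd(b, 1);   // even-numbered blocks
}

// Hypothetical helper (not from the original post): launches nblocks blocks
// of 1 thread each and copies both counters back to the host.
cudaError_t countBlocks(int nblocks, int *odd, int *even)
{
    int *da = NULL, *db = NULL;
    cudaError_t err = cudaMalloc((void**)&da, sizeof(int));
    if (err != cudaSuccess) return err;
    err = cudaMalloc((void**)&db, sizeof(int));
    if (err != cudaSuccess) { cudaFree(da); return err; }

    // Start both counters at zero on the device.
    cudaMemset(da, 0, sizeof(int));
    cudaMemset(db, 0, sizeof(int));

    test<<<nblocks, 1>>>(da, db);
    err = cudaGetLastError();                 // launch errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();        // wait for the kernel to finish

    // Read the results back; note the device pointers, not their addresses.
    if (err == cudaSuccess)
        err = cudaMemcpy(odd, da, sizeof(int), cudaMemcpyDeviceToHost);
    if (err == cudaSuccess)
        err = cudaMemcpy(even, db, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(da);
    cudaFree(db);
    return err;
}

int main()
{
    int a = 0, b = 0;
    cudaError_t err = countBlocks(30, &a, &b);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("%d -- %d\n", a, b);
    return 0;
}
```

With 30 blocks, 15 have an odd index and 15 an even one, so both counters should come back as 15 regardless of the order in which the blocks run.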

Thanks for the quick reply.
That worked.

If you want one increment per kernel launch, be sure that only one thread increments the variable. Something like this:

if (threadIdx.x == 0 && blockIdx.x == 0) { atomicAdd(&counter, 1); }
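
A rough sketch of how that guard could be used to count kernel launches is below. It assumes counter is declared as a __device__ variable, which is what taking &counter inside the kernel suggests; the kernel name work, the helper launchAndCount, and the <<<4,32>>> configuration are all made up for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed declaration: a counter in global memory that persists across launches.
__device__ int counter = 0;

__global__ void work()
{
    // Exactly one thread in the whole grid bumps the counter per launch.
    if (threadIdx.x == 0 && blockIdx.x == 0)
        atomicAdd(&counter, 1);
    // ... the kernel's real work would go here ...
}

// Hypothetical helper: resets the counter, launches the kernel `launches`
// times, and returns the counter value read back from the device.
int launchAndCount(int launches)
{
    int zero = 0;
    cudaMemcpyToSymbol(counter, &zero, sizeof(int));

    for (int i = 0; i < launches; ++i)
        work<<<4, 32>>>();
    cudaDeviceSynchronize();

    int result = -1;   // stays -1 if the copy back fails
    cudaMemcpyFromSymbol(&result, counter, sizeof(int));
    return result;
}

int main()
{
    printf("launches counted: %d\n", launchAndCount(5));
    return 0;
}
```

Because launches on the same stream run one after another and only one thread per launch touches the counter, a plain counter++ would likely behave the same here; atomicAdd is kept so the code stays correct if kernels are later launched concurrently on several streams.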