All,
My approach may well be wrong, but I’m trying to use atomicAdd to build a counter in my kernel.
Here is the code:
//----------------------------------------------------------//
__shared__ int check[1];   // declare a one-cell counter array
check[0] = 0;              // initialize (maybe the problem is here?)
int x = blockIdx.x * blockDim.x + threadIdx.x;
int y = blockIdx.y * blockDim.y + threadIdx.y;
if (x <= rows && y <= cols) {
    atomicAdd(&check[0], 1);          // do the add
    atomic[x * cols + y] = check[0];  // write to an array for verification
}
//----------------------------------------------------------//
The problem is that when I print the “atomic” array from the host, every element is 9.
When I step through in cuda-gdb, the atomicAdd appears to work (check[0] changes from 0 to 1), but on the next step check[0] becomes 9.
I’m pretty confused about this and would really appreciate any insight any of you might have!
I’m using a GTX 285 as the primary card and a GT 240 as the secondary card (cuda-gdb runs on the GT 240); they have compute capabilities of 1.3 and 1.2, respectively.
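
One possible explanation I want to test: every thread re-reads check[0] only after the other threads in its warp have already done their atomicAdd, so they all see the final count. atomicAdd returns the value the counter held before the add, so using that return value should give each thread its own number. Below is a minimal sketch of that idea, not a confirmed fix. The kernel name countPerBlock, the parameter out (standing in for my “atomic” array), the 3x3 problem size, and the 4x4 block are all assumptions for illustration. It also has a single thread initialize the counter, followed by __syncthreads(), and uses < rather than <= in the bounds check so no thread writes past the end of the array.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: "out", "rows" and "cols" stand in for the
// "atomic" array and the dimensions used in the post.
__global__ void countPerBlock(int *out, int rows, int cols)
{
    __shared__ int check;   // one counter per block

    // Let a single thread initialize the counter, then make every
    // thread wait until that write is visible to the whole block.
    if (threadIdx.x == 0 && threadIdx.y == 0) {
        check = 0;
    }
    __syncthreads();

    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;

    if (x < rows && y < cols) {
        // atomicAdd returns the value stored BEFORE the add, so each
        // thread receives a distinct number without re-reading check.
        int ticket = atomicAdd(&check, 1) + 1;
        out[x * cols + y] = ticket;
    }
}

int main()
{
    const int rows = 3, cols = 3;      // assumed size: 9 elements
    const int n = rows * cols;

    int *dOut = NULL;
    cudaMalloc((void **)&dOut, n * sizeof(int));
    cudaMemset(dOut, 0, n * sizeof(int));

    dim3 block(4, 4);                  // assumed: one block covers the grid
    dim3 grid(1, 1);
    countPerBlock<<<grid, block>>>(dOut, rows, cols);

    int hOut[9];
    cudaMemcpy(hOut, dOut, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Each element should hold a distinct value between 1 and 9;
    // which thread gets which value is not defined.
    for (int i = 0; i < n; ++i) {
        printf("%d ", hOut[i]);
    }
    printf("\n");

    cudaFree(dOut);
    return 0;
}
```

Note that the counter here is per block, since shared memory is not visible across blocks; a grid-wide counter would need to live in global memory instead. Atomic operations on shared memory need compute capability 1.2 or higher, which both of my cards meet, so the build has to target at least sm_12.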