Help with float atomic add: how to implement a float atomic add on an old device

Hello everyone

I have a problem.

My CUDA code spends almost half of its running time in a float atomic add.
My device is a 9800 GTX, which has no atomic add for float, so I can't use a built-in one.
I implemented a float atomic add myself.
Here is the code:

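__device__ void atomicaddfloat(float *pa, int *atomicadd, float &b)
{
    bool leaveloop = true;  // true means "keep looping"
    do {
        // The integer counter acts as a lock: only the thread that sees 0 enters.
        if (atomicAdd(atomicadd, 1) == 0)
        {
            leaveloop = false;
            *pa = *pa + b;   // protected update of the float value
            *atomicadd = 0;  // release the lock
        }
    } while (leaveloop);
}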

Using a secondary integer array (atomicadd) as a lock, the function works.
But its performance is not satisfactory.

Can anybody help me?

Yes, just search the forum: http://forums.nvidia.com/index.php?showtop…rt=#entry451309
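
For reference, one widely used way to emulate a float atomic add on hardware that only has integer atomics is a compare-and-swap loop on the float's bit pattern. It may differ from what the linked thread describes, so treat this as a minimal sketch: the names atomicAddFloatCAS, accumulateOnes and runAccumulate are made up for illustration, and it assumes the device supports 32-bit atomicCAS on global memory.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: emulates a float atomic add with a 32-bit
// compare-and-swap loop instead of a separate lock variable.
__device__ float atomicAddFloatCAS(float *address, float val)
{
    int *address_as_int = (int *)address;
    int old = *address_as_int;
    int assumed;
    do {
        assumed = old;
        // Swap in the bits of (assumed value + val) only if nobody
        // changed the location since we read it.
        old = atomicCAS(address_as_int, assumed,
                        __float_as_int(val + __int_as_float(assumed)));
    } while (assumed != old);  // another thread got in first; retry
    return __int_as_float(old);  // value before this thread's add
}

// Illustrative kernel: every thread adds 1.0f to the same location.
__global__ void accumulateOnes(float *sum)
{
    atomicAddFloatCAS(sum, 1.0f);
}

// Illustrative host helper: launches blocks * threads threads and
// returns the accumulated total (error checking omitted for brevity).
float runAccumulate(int blocks, int threads)
{
    float *d_sum = 0;
    float h_sum = 0.0f;
    cudaMalloc((void **)&d_sum, sizeof(float));
    cudaMemcpy(d_sum, &h_sum, sizeof(float), cudaMemcpyHostToDevice);
    accumulateOnes<<<blocks, threads>>>(d_sum);
    cudaMemcpy(&h_sum, d_sum, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_sum);
    return h_sum;
}
```

Compared with the lock-based version, each thread retries only when the target value itself changed, and no secondary integer array is needed. Comparing the integer bit patterns rather than the float values keeps the loop from spinning forever if the location holds a NaN.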

Thanks a lot, it is very helpful.
