Compute Capability 1.0 and atomic functions


I'm working on a machine with a GeForce 8800 Ultra, so I only have compute capability 1.0. I want to write a function that needs atomicAdd. Is there any way to do this with that card, or do I have to find another approach?

Are you planning to add to the same global memory address from several blocks, or just from several threads within the same block? It is possible to serialize writes from within the same block without using atomics, but I do not know of any way to serialize writes from different blocks on compute capability 1.0 (global memory atomics such as atomicAdd require compute capability 1.1 or higher). A workaround is to keep a separate counter per thread block and then do a final pass on the CPU to combine them into the result (the histogram SDK sample does this when atomics are not used). If you have a lot of blocks, a second kernel launch that sums the per-block results by reduction may be used instead.
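As a rough illustration of the per-block counter idea, here is a minimal sketch for the simple case of summing an int array. It is not taken from the histogram SDK sample. The names partialSums, sumWithPerBlockCounters and THREADS_PER_BLOCK are made up for this example, the block size of 256 is an assumption, and error checking is left out to keep it short. Each block writes only to its own slot in blockSums, so no two blocks ever touch the same address and no atomics are needed.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK 256  // assumed block size; must be a power of two

// Each block reduces its share of the input in shared memory and writes one
// partial sum to blockSums[blockIdx.x]. Blocks never share an output address.
__global__ void partialSums(const int *in, int *blockSums, int n)
{
    __shared__ int cache[THREADS_PER_BLOCK];
    int tid = threadIdx.x;

    // Grid-stride loop: every thread accumulates a private sum in a register.
    int sum = 0;
    for (int i = blockIdx.x * blockDim.x + tid; i < n; i += gridDim.x * blockDim.x)
        sum += in[i];
    cache[tid] = sum;
    __syncthreads();

    // Tree reduction within the block; only __syncthreads() is required.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            cache[tid] += cache[tid + stride];
        __syncthreads();
    }

    // One thread per block publishes that block's counter.
    if (tid == 0)
        blockSums[blockIdx.x] = cache[0];
}

// Hypothetical host helper: launches the kernel, then does the final pass
// over the per-block counters on the CPU.
int sumWithPerBlockCounters(const int *h_in, int n, int numBlocks)
{
    int *d_in = 0, *d_blockSums = 0;
    cudaMalloc((void **)&d_in, n * sizeof(int));
    cudaMalloc((void **)&d_blockSums, numBlocks * sizeof(int));
    cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

    partialSums<<<numBlocks, THREADS_PER_BLOCK>>>(d_in, d_blockSums, n);

    // This blocking copy also waits for the kernel to finish.
    std::vector<int> h_blockSums(numBlocks);
    cudaMemcpy(&h_blockSums[0], d_blockSums, numBlocks * sizeof(int),
               cudaMemcpyDeviceToHost);

    int total = 0;
    for (int b = 0; b < numBlocks; ++b)
        total += h_blockSums[b];

    cudaFree(d_in);
    cudaFree(d_blockSums);
    return total;
}
```

For the second-kernel variant, one option under the same assumptions would be to launch the same kernel again with a single block, using d_blockSums as the input and a one-element device buffer as the output, in place of the CPU loop.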

I have to serialize from many blocks. I’ll look at the histogram SDK sample. Thanks a bunch!