Atomic operations versus frame buffer blend: how do these relate?

For my project I need to do a lot of in-place add and subtract operations on data. It would be great to be able to use frame buffer blending in CUDA, so that this could be handled in the memory controller instead of through a read-modify-write cycle.

Now in CUDA 0.9 something called ‘atomic operations’ was added that seems to be much like what I describe. Alas, this is only supported with compute capability 1.1 (thus, on the 8600 but not the 8800), and it only works on integers, not floats/uchars.

Are there any plans to implement frame buffer blending in CUDA, so as to allow some “ops on write” like addition, subtraction and max/min? I mean, on 8800-class hardware.

Access to video memory in CUDA is done via load/store, and doesn’t go through the normal graphics raster operations like blending. We don’t have any plans to expose blending or any other raster ops in CUDA.

The 8500/8600 and future chips include additional hardware to perform atomic operations.
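
For the in-place integer add case from the question, something along these lines should work on hardware with atomics. This is only a rough sketch: the kernel name `accumulate`, the bin layout, and the sizes are made up for illustration, and it is written against a current CUDA toolkit rather than the 0.9 beta.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical example kernel: every thread adds its value into one of a few
// bins in global memory. Many threads hit the same bin, so a plain
// "bins[b] += v" (load, add, store) could lose updates.
__global__ void accumulate(int *bins, int numBins, const int *values, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(&bins[i % numBins], values[i]);  // atomic in-place add
}

int main()
{
    const int n = 1 << 16, numBins = 8;
    std::vector<int> hValues(n, 1);      // every thread contributes 1
    std::vector<int> hBins(numBins, 0);

    int *dValues = nullptr, *dBins = nullptr;
    cudaMalloc(&dValues, n * sizeof(int));
    cudaMalloc(&dBins, numBins * sizeof(int));
    cudaMemcpy(dValues, hValues.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dBins, 0, numBins * sizeof(int));

    accumulate<<<(n + 255) / 256, 256>>>(dBins, numBins, dValues, n);
    cudaMemcpy(hBins.data(), dBins, numBins * sizeof(int), cudaMemcpyDeviceToHost);

    // With atomic adds, each bin should end up holding n / numBins.
    for (int b = 0; b < numBins; ++b)
        printf("bin %d = %d (expected %d)\n", b, hBins[b], n / numBins);

    cudaFree(dValues);
    cudaFree(dBins);
    return 0;
}
```

Subtraction works the same way with `atomicSub`, or by adding a negative value.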

It’s possible that the atomic operations might be extended in future hardware to support additional types such as floats.
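
Until something like that exists, a float add can be emulated on top of the 32-bit integer compare-and-swap. The sketch below assumes `atomicCAS` is available on the target (it is an atomic, so it needs compute capability 1.1 as well); the helper name `atomicAddFloatCAS` and the test kernel are hypothetical, and this is a software retry loop, not a hardware float atomic.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: emulate an atomic float add with a compare-and-swap
// loop on the float's 32-bit pattern. Returns the value seen before the add.
__device__ float atomicAddFloatCAS(float *address, float val)
{
    int *addressAsInt = reinterpret_cast<int *>(address);
    int old = *addressAsInt;
    int assumed;
    do {
        assumed = old;
        float sum = __int_as_float(assumed) + val;
        // Store the new sum only if nobody changed the word in the meantime.
        old = atomicCAS(addressAsInt, assumed, __float_as_int(sum));
    } while (assumed != old);  // integer compare, so NaN cannot spin forever
    return __int_as_float(old);
}

__global__ void sumAll(float *total, const float *values, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAddFloatCAS(total, values[i]);
}

int main()
{
    const int n = 4096;
    float hValues[n];
    for (int i = 0; i < n; ++i)
        hValues[i] = 1.0f;
    float hTotal = 0.0f;

    float *dValues = nullptr, *dTotal = nullptr;
    cudaMalloc(&dValues, n * sizeof(float));
    cudaMalloc(&dTotal, sizeof(float));
    cudaMemcpy(dValues, hValues, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dTotal, &hTotal, sizeof(float), cudaMemcpyHostToDevice);

    sumAll<<<(n + 255) / 256, 256>>>(dTotal, dValues, n);
    cudaMemcpy(&hTotal, dTotal, sizeof(float), cudaMemcpyDeviceToHost);

    printf("total = %.1f (expected %d)\n", hTotal, n);

    cudaFree(dValues);
    cudaFree(dTotal);
    return 0;
}
```

Expect this to be slow when many threads hit the same address, since every failed swap means another trip around the loop. Float addition is also not associative, so the result can depend on the order in which threads win.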

I agree it’s a shame that the 8800 doesn’t support atomic operations, but graphics hardware evolves, and if we added every feature to every chip we’d never ship anything!

What about atomic operations on shared memory? Or byte-wise or bit-wise ones?

Atomic operations are currently only supported on global memory. I can’t comment on future products.

Currently the atomic operations operate on 32-bit ints only. You can do bit-wise operations using iAtomicOr/And/Xor.
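
As a rough illustration of the bit-wise case, the sketch below sets, clears and toggles individual bits of 32-bit words in global memory. It uses the function names from the current CUDA toolkit (`atomicOr`, `atomicAnd`, `atomicXor`), which may not match the spelling above; the kernel name `updateFlags` and the three-word layout are invented for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical example kernel, launched with exactly 32 threads.
// flags[0]: even-numbered threads set their bit.
// flags[1]: odd-numbered threads clear their bit.
// flags[2]: every thread toggles one of the low four bits.
__global__ void updateFlags(unsigned int *flags)
{
    unsigned int i = threadIdx.x;
    unsigned int bit = 1u << i;

    if (i % 2 == 0)
        atomicOr(&flags[0], bit);        // set bit i
    else
        atomicAnd(&flags[1], ~bit);      // clear bit i

    atomicXor(&flags[2], 1u << (i % 4)); // toggle bit (i mod 4)
}

int main()
{
    unsigned int hFlags[3] = { 0u, 0xFFFFFFFFu, 0u };
    unsigned int *dFlags = nullptr;
    cudaMalloc(&dFlags, sizeof(hFlags));
    cudaMemcpy(dFlags, hFlags, sizeof(hFlags), cudaMemcpyHostToDevice);

    updateFlags<<<1, 32>>>(dFlags);
    cudaMemcpy(hFlags, dFlags, sizeof(hFlags), cudaMemcpyDeviceToHost);

    // Even bits set in word 0, odd bits cleared in word 1, and each of the
    // low four bits of word 2 toggled an even number of times (8), so it
    // ends up back at zero.
    printf("flags[0] = 0x%08X (expected 0x55555555)\n", hFlags[0]);
    printf("flags[1] = 0x%08X (expected 0x55555555)\n", hFlags[1]);
    printf("flags[2] = 0x%08X (expected 0x00000000)\n", hFlags[2]);

    cudaFree(dFlags);
    return 0;
}
```

Since the atomics work on whole 32-bit words, a byte-wise update has to be expressed as a masked operation on the word that contains the byte.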