For my project I need to do a lot of in-place add and subtract operations on data. It would be great to be able to use frame buffer blending in CUDA, so that these updates are handled in the memory controller instead of by a read-modify-write cycle.
Now in CUDA 0.9 something called ‘atomic operations’ was added that seems to be much like what I describe. Alas, this is only supported with compute capability 1.1 (thus the 8600, not the 8800), and it only works on integers, not floats or uchars.
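To make concrete what I mean, here is a minimal sketch of the integer-only atomic path as I understand it. The kernel and helper names (`accumulate`, `sumIntoBins`) and the bin/target layout are my own invention for illustration, error checking is left out, and it has to be built for a target with compute capability 1.1 or higher.

```cuda
#include <cuda_runtime.h>

// Each thread adds one value into the bin selected by targets[i].
// atomicAdd makes the read-modify-write indivisible, so threads that
// hit the same bin do not lose updates. Integer operands only here.
__global__ void accumulate(int *bins, const int *values,
                           const int *targets, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(&bins[targets[i]], values[i]);
}

// Hypothetical host helper (not part of CUDA): adds values[i] into
// bins[targets[i]] on the GPU. bins is read and updated in place.
// Error checking is omitted for brevity.
void sumIntoBins(const int *values, const int *targets, int n,
                 int *bins, int numBins)
{
    int *dBins, *dValues, *dTargets;
    cudaMalloc((void **)&dBins, numBins * sizeof(int));
    cudaMalloc((void **)&dValues, n * sizeof(int));
    cudaMalloc((void **)&dTargets, n * sizeof(int));

    cudaMemcpy(dBins, bins, numBins * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dValues, values, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(dTargets, targets, n * sizeof(int), cudaMemcpyHostToDevice);

    const int threads = 256;
    accumulate<<<(n + threads - 1) / threads, threads>>>(dBins, dValues,
                                                         dTargets, n);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(bins, dBins, numBins * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(dBins);
    cudaFree(dValues);
    cudaFree(dTargets);
}
```

Calling `sumIntoBins(values, targets, n, bins, numBins)` from host code leaves the accumulated sums in `bins`. This is the behaviour I would like to have for floats and uchars as well, ideally without the 1.1 requirement.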
Are there any plans to implement frame buffer blending in CUDA, so as to allow some “ops on write” such as addition, subtraction and max/min? I mean on 8800-class hardware.