Atomic instructions for floats? Trying to atomically add floats in a matrix

I am working on a FEM matrix-fill program where non-zero elements are computed and stored in a compressed storage format. Off-diagonal elements are simply written to global memory after computation, but diagonal elements receive contributions from additional threads that are not necessarily in the same block. One simple solution would be to atomically add the diagonal contributions. The matrix elements, however, are floats. The documentation only shows atomics for int and bool types, but since floats and ints are both 32-bit, this doesn’t seem like a hardware limitation. If anyone has already figured out atomics on floats, please respond. Otherwise, if you have any tips, or if you are looking for the same functionality, please join in.

There are currently no atomic operations on floats. Even though they are 32 bits, just like ints, the arithmetic is quite different, and associativity doesn’t hold due to finite precision, so the result of a sum depends on the order in which the contributions arrive. You could fake atomic min/max operations on floats with the same sign by using the corresponding atomic integer functions, since the bit patterns of same-sign floats can be compared as integers (for negative values the order is reversed, so min and max swap roles). I don’t think you can do the same for add/subtract.
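To make the min/max trick concrete, here is a minimal sketch, not a tested implementation. The names `atomicMaxNonNegative`, `atomicMaxNegative`, `maxKernel` and `maxOfNonNegative` are made up for illustration; only `atomicMax`, `atomicMin` and `__float_as_int` are actual CUDA functions. It assumes IEEE-754 floats, no NaNs in the data, that every value passed to a given helper really has the sign that helper expects, and hardware that supports integer atomics on global memory. Error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: atomic max for floats that are all >= 0.
// For non-negative IEEE-754 floats the bit patterns sort in the same
// order as the values, so the integer atomicMax picks the right one.
__device__ void atomicMaxNonNegative(float* addr, float value)
{
    atomicMax(reinterpret_cast<int*>(addr), __float_as_int(value));
}

// Hypothetical helper: atomic max for floats that all have the sign bit set.
// Here a larger magnitude gives a larger bit pattern, so the order is
// reversed and the float max corresponds to the integer min.
__device__ void atomicMaxNegative(float* addr, float value)
{
    atomicMin(reinterpret_cast<int*>(addr), __float_as_int(value));
}

// Each thread folds one input element into the single result location.
__global__ void maxKernel(const float* in, int n, float* result)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicMaxNonNegative(result, in[i]);
    }
}

// Host wrapper (assumes n > 0 and all inputs non-negative).
float maxOfNonNegative(const float* host, int n)
{
    float* dIn = nullptr;
    float* dOut = nullptr;
    float out = 0.0f;  // identity for a max over non-negative values

    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, sizeof(float));
    cudaMemcpy(dIn, host, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dOut, &out, sizeof(float), cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    maxKernel<<<blocks, threads>>>(dIn, n, dOut);

    // The blocking copy back also waits for the kernel to finish.
    cudaMemcpy(&out, dOut, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return out;
}
```

Calling `maxOfNonNegative` on a small host array of non-negative values should return the largest one. This only works because max is insensitive to the order of the operands; it does not carry over to add/subtract, where the integer and float operations give different bit patterns.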