CUDA supports atomic operations on integers. Will future versions of CUDA support atomic operations on floating-point values?
This is purely my own speculation, but I wouldn't expect this to ever be supported in hardware. My thinking is that integer operations need relatively few transistors and have relatively low latency, so they can be implemented within the memory controller itself. FPUs are far more costly in terms of transistors, so adding an FPU to the memory controller (or to each of the several memory controllers) just for atomic instructions seems unlikely.
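Whatever the hardware ends up doing, a floating-point atomic add can be emulated in software on top of an integer compare-and-swap, which is the kind of integer atomic discussed above. Below is a minimal host-side C++ sketch of that idea, not CUDA code: the float is stored as its 32-bit pattern in a `std::atomic<std::uint32_t>`, and the helper name `atomic_add_float` is hypothetical. The same loop shape should carry over to a kernel using CUDA's integer `atomicCAS`, but that is an assumption and is not shown here.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of any CUDA or standard API):
// atomically adds `value` to the float whose bit pattern lives in `target`
// and returns the float value that was stored before the add.
inline float atomic_add_float(std::atomic<std::uint32_t>& target, float value) {
    std::uint32_t old_bits = target.load();
    for (;;) {
        // Reinterpret the current bits as a float without violating aliasing rules.
        float old_val;
        std::memcpy(&old_val, &old_bits, sizeof old_val);

        // Do the floating-point work outside the atomic operation.
        float new_val = old_val + value;
        std::uint32_t new_bits;
        std::memcpy(&new_bits, &new_val, sizeof new_bits);

        // Publish only if nobody changed the cell in the meantime.
        // On failure, old_bits is refreshed with the current contents and we retry.
        if (target.compare_exchange_weak(old_bits, new_bits)) {
            return old_val;
        }
    }
}
```

The only atomic step is an integer compare-and-swap. The floating-point addition runs in ordinary code, so nothing like an FPU is needed where the atomic is resolved. The cost is that the loop may retry many times when several threads hit the same address.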