I guess this cannot take full advantage of float4? Maybe atomicAdd simply does not handle float4? Would you suggest that the developers add this in the future?
Thank you!!!
========================================================
By the way, I found this: use float AtomicAdd to write to a float4
Twelve years have passed, and I'm not sure whether atomicAdd can accept float4 now… Haha…
Atomics can work on up to a 64-bit (properly aligned) quantity at a time. So you cannot do a single atomic add that covers an entire float4: that is a 128-bit quantity. Furthermore, the atomic engine doesn't know anything about elementwise addition for a vector type.
The only types supported by atomicAdd are those listed in the programming guide. And if you wanted to perform multiple additions across the various elements of a vector type, each of those additions would need its own atomicAdd instruction.
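As a rough sketch of the "one atomicAdd per element" approach described above: the helper name atomicAddFloat4 and the test kernel are hypothetical, not part of CUDA. Each component is updated atomically on its own, so another thread may observe a float4 in which only some components have been updated.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): adds val to *addr one component at a time.
// Each of the four updates is atomic by itself; the float4 as a whole is not.
__device__ void atomicAddFloat4(float4* addr, float4 val)
{
    atomicAdd(&addr->x, val.x);
    atomicAdd(&addr->y, val.y);
    atomicAdd(&addr->z, val.z);
    atomicAdd(&addr->w, val.w);
}

// Every thread adds the same vector into a single shared accumulator.
__global__ void accumulate(float4* sum)
{
    atomicAddFloat4(sum, make_float4(1.0f, 2.0f, 3.0f, 4.0f));
}

int main()
{
    float4* d_sum = nullptr;
    cudaMalloc(&d_sum, sizeof(float4));
    cudaMemset(d_sum, 0, sizeof(float4));

    accumulate<<<4, 256>>>(d_sum);  // 1024 threads in total

    float4 h_sum;
    cudaMemcpy(&h_sum, d_sum, sizeof(float4), cudaMemcpyDeviceToHost);
    cudaFree(d_sum);

    printf("%.1f %.1f %.1f %.1f\n", h_sum.x, h_sum.y, h_sum.z, h_sum.w);

    // With 1024 threads the expected totals are 1024, 2048, 3072, 4096.
    bool ok = h_sum.x == 1024.0f && h_sum.y == 2048.0f &&
              h_sum.z == 3072.0f && h_sum.w == 4096.0f;
    return ok ? 0 : 1;
}
```

This is fine when you only need each component's final total, which is the usual case for accumulation.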
You could possibly construct your own custom atomic to perform an atomic addition on two of the quantities at a time. I think that is unlikely to provide better performance than 2 native atomics, however.
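For illustration only, one way such a custom atomic could look is a compare-and-swap loop over a 64-bit word holding two floats. The name atomicAddFloat2 is hypothetical, and this sketch assumes the float2 is 8-byte aligned (which the float2 type guarantees). A float4 could then be handled as two float2 halves, (x, y) and (z, w).

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): adds val to *addr so that both
// components are updated together as one 64-bit quantity, using an atomicCAS loop.
__device__ void atomicAddFloat2(float2* addr, float2 val)
{
    unsigned long long* p = reinterpret_cast<unsigned long long*>(addr);
    unsigned long long old = *p;
    unsigned long long assumed;
    do {
        assumed = old;

        // Reinterpret the 64-bit word as two floats and add elementwise.
        float2 cur;
        memcpy(&cur, &assumed, sizeof(cur));
        cur.x += val.x;
        cur.y += val.y;

        unsigned long long desired;
        memcpy(&desired, &cur, sizeof(desired));

        // Store only if nobody changed the word in the meantime; otherwise retry.
        old = atomicCAS(p, assumed, desired);
    } while (assumed != old);  // integer compare, so NaN payloads cannot loop forever
}

__global__ void accumulate(float2* sum)
{
    atomicAddFloat2(sum, make_float2(1.0f, 2.0f));
}

int main()
{
    float2* d_sum = nullptr;
    cudaMalloc(&d_sum, sizeof(float2));
    cudaMemset(d_sum, 0, sizeof(float2));

    accumulate<<<4, 256>>>(d_sum);  // 1024 threads in total

    float2 h_sum;
    cudaMemcpy(&h_sum, d_sum, sizeof(float2), cudaMemcpyDeviceToHost);
    cudaFree(d_sum);

    printf("%.1f %.1f\n", h_sum.x, h_sum.y);

    // With 1024 threads the expected totals are 1024 and 2048.
    return (h_sum.x == 1024.0f && h_sum.y == 2048.0f) ? 0 : 1;
}
```

Under contention the loop retries, which is one reason it may well be slower than two native atomicAdd calls.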