I noticed that if a variable (a float in my case) is exactly zero and you call atomicAdd(&addr[index], var), the memory location is still accessed, even though the update leaves the stored value unchanged.
Does this sound right?
I suppose this is useful if you want the current value (atomic operations return the old value), but when you only want to update and never use the return value, it is wasted work.
When I checked for a non-zero value before calling atomicAdd() and skipped the memory access for zeros, I got a modest performance increase. That is probably because about 30% of my input values are zero, so the check is only worthwhile for inputs like that.
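A minimal sketch of the call-site check, to make the idea concrete. The kernel name scatterAddSkipZero, its parameters, and the host helper demoScatterAdd are made up for illustration and are not my actual code; error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical scatter-add kernel: each thread adds vals[i] into out[idx[i]].
__global__ void scatterAddSkipZero(float *out, const int *idx,
                                   const float *vals, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float v = vals[i];
    // Skip the atomic when v is +0.0f or -0.0f, since the sum would not change.
    // NaN compares unequal to zero, so a NaN input is still added.
    if (v != 0.0f) {
        atomicAdd(&out[idx[i]], v);
    }
}

// Hypothetical host helper: adds {1, 0, 2, 0, 3} into a single slot
// and returns the accumulated value.
float demoScatterAdd()
{
    const int n = 5;
    const float hVals[n] = {1.0f, 0.0f, 2.0f, 0.0f, 3.0f};
    const int hIdx[n] = {0, 0, 0, 0, 0};

    float *dOut = nullptr;
    float *dVals = nullptr;
    int *dIdx = nullptr;
    cudaMalloc(&dOut, sizeof(float));
    cudaMalloc(&dVals, n * sizeof(float));
    cudaMalloc(&dIdx, n * sizeof(int));

    cudaMemset(dOut, 0, sizeof(float));  // all-zero bytes is 0.0f
    cudaMemcpy(dVals, hVals, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dIdx, hIdx, n * sizeof(int), cudaMemcpyHostToDevice);

    scatterAddSkipZero<<<1, 32>>>(dOut, dIdx, dVals, n);

    float result = 0.0f;
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&result, dOut, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(dOut);
    cudaFree(dVals);
    cudaFree(dIdx);
    return result;
}
```

With this input the two zero entries never reach atomicAdd(), and the three remaining adds sum to 6. Whether the extra branch pays off presumably depends on how many inputs are zero and how much contention there is on the target addresses.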
I wonder whether it would make sense to write a modified atomicAdd() that does not access memory if the input value is either zero or negative. Has anybody tried such a custom implementation?
Are there any downsides to that approach compared with checking for non-zero before the atomicAdd() call?
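For reference, a rough sketch of what I mean by a custom version. atomicAddNonZero is a made-up name, not part of the CUDA API, and the kernel and host helper around it are only there to exercise it. This version skips only zeros: skipping negative inputs too would change the resulting sum, so that would only be valid where negative contributions are meant to be discarded anyway.

```cuda
#include <cuda_runtime.h>

// Hypothetical wrapper, not a CUDA built-in. It returns void on purpose:
// when the add is skipped there is no old value to hand back without
// touching memory, which is the access we are trying to avoid.
__device__ __forceinline__ void atomicAddNonZero(float *addr, float val)
{
    if (val != 0.0f) {  // false for both +0.0f and -0.0f
        atomicAdd(addr, val);
    }
}

// Hypothetical kernel that accumulates all inputs into one sum.
__global__ void accumulate(float *sum, const float *vals, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAddNonZero(sum, vals[i]);
    }
}

// Hypothetical host helper: accumulates {0.5, 0, 0, 1.5, -1.0}.
// The negative entry is still added; only the zeros are skipped.
float demoAtomicAddNonZero()
{
    const int n = 5;
    const float hVals[n] = {0.5f, 0.0f, 0.0f, 1.5f, -1.0f};

    float *dSum = nullptr;
    float *dVals = nullptr;
    cudaMalloc(&dSum, sizeof(float));
    cudaMalloc(&dVals, n * sizeof(float));

    cudaMemset(dSum, 0, sizeof(float));
    cudaMemcpy(dVals, hVals, n * sizeof(float), cudaMemcpyHostToDevice);

    accumulate<<<1, 32>>>(dSum, dVals, n);

    float result = 0.0f;
    cudaMemcpy(&result, dSum, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(dSum);
    cudaFree(dVals);
    return result;
}
```

Once inlined, the wrapper should compile to the same branch as a check at the call site, so I would not expect a performance difference between the two; the wrapper just keeps the test in one place. The cost in both cases is a possible divergent branch when only some threads in a warp have zero inputs.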