Does anyone have any solid information on the performance of atomic functions? I’ve looked all around, and the only thing I can find is ‘it’s slow’. Just how slow is that? In particular, how do atomic functions perform when writing to contested versus uncontested memory locations?
Basically, I want to know if they are suitable for building a spin lock, in this case for scattering points into a large texture. The texture is expected to be on the order of 1k x 1k elements, with writes done more or less randomly, so the lock would only block occasionally.
The code would look like this:
while (atomicExch(&lock, 1) != 0); // spin until we get the lock
// ... write to the element here ...
lock = 0; // unlock
lock is an integer stored per element, so this is a fine-grained lock.
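For anyone who wants to try this, here is a minimal sketch of the per-element lock as a complete test program. Everything in it beyond atomicExch and the lock idea is an assumption for illustration: the names scatter and run_scatter, the hash used to pick a target element, and the flat float array standing in for the texture. It also departs from the loop above in two ways: it attempts the lock once per loop iteration with atomicCAS rather than spinning inside the while condition, which is the commonly suggested way to avoid threads of the same warp blocking each other on older hardware, and it releases the lock with atomicExch after a __threadfence. Treat it as a starting point for timing contested versus uncontested writes, not as a verified implementation.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread adds 1.0f to one pseudo-randomly chosen
// element, guarded by a per-element integer lock (0 = free, 1 = held).
__global__ void scatter(volatile float *tex, int *locks, int n_elems, int n_points)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n_points) return;

    // Cheap multiplicative hash standing in for the real scatter target
    // (an assumption; the original post does not say how targets are chosen).
    unsigned int idx = ((unsigned int)tid * 2654435761u) % (unsigned int)n_elems;

    bool done = false;
    while (!done) {
        // Try to take the lock once per iteration instead of spinning inside
        // the atomic, so a lock holder in the same warp can reach the unlock.
        if (atomicCAS(&locks[idx], 0, 1) == 0) {
            tex[idx] = tex[idx] + 1.0f;   // critical section
            __threadfence();              // make the write visible before releasing
            atomicExch(&locks[idx], 0);   // unlock
            done = true;
        }
    }
}

// Host helper: runs the kernel and returns the sum over all elements,
// which should equal n_points if no update was lost.
long long run_scatter(int n_elems, int n_points)
{
    float *tex = nullptr;
    int *locks = nullptr;
    cudaMalloc(&tex, n_elems * sizeof(float));
    cudaMalloc(&locks, n_elems * sizeof(int));
    cudaMemset(tex, 0, n_elems * sizeof(float));
    cudaMemset(locks, 0, n_elems * sizeof(int));

    int threads = 256;
    int blocks = (n_points + threads - 1) / threads;
    scatter<<<blocks, threads>>>(tex, locks, n_elems, n_points);
    cudaDeviceSynchronize();

    std::vector<float> host(n_elems);
    cudaMemcpy(host.data(), tex, n_elems * sizeof(float), cudaMemcpyDeviceToHost);

    long long total = 0;
    for (float v : host) total += (long long)v;

    cudaFree(tex);
    cudaFree(locks);
    return total;
}

int main()
{
    // Few elements and many points forces heavy contention on each lock.
    printf("total = %lld\n", run_scatter(1024, 1 << 16));
    return 0;
}
```

Varying n_elems while keeping n_points fixed gives a rough way to compare contested and uncontested behaviour: a small n_elems forces many threads onto the same lock, while something like 1024 * 1024 makes collisions rare. For a simple accumulation like this one, a single atomicAdd on the element would avoid the lock entirely, so the lock only pays off when the critical section does more than one read-modify-write.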