Fill output buffer from multiple threads

Is using atomics, with multiple threads writing to the same memory, generally a bad idea compared to giving each thread its own memory location and doing a reduction afterwards?

It depends on how often the atomics contend, that is, how often threads hit the same address and block each other. An atomic can't be faster than a standard write, but sometimes you cannot avoid it.
Also, since there are no vectorized atomics, you would need to use one atomic for each color component.
Multi-GPU is another topic which requires attention.
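
To make the two options from the question concrete, here is a minimal CPU-side sketch in C++, not device code. It assumes an RGB float buffer and a simple splat list; the names `Splat`, `atomicAddFloat`, `splatAtomic` and `splatReduce` are made up for illustration. In a CUDA kernel the per-component add would be a call to `atomicAdd` on a `float` address, issued three times per pixel.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical splat record: target pixel index plus an RGB contribution.
struct Splat { std::size_t pixel; float r, g, b; };

// Atomic float add built from a compare-exchange loop
// (std::atomic<float>::fetch_add only exists from C++20 on).
inline void atomicAddFloat(std::atomic<float>& target, float value) {
    float old = target.load(std::memory_order_relaxed);
    // On failure 'old' is refreshed with the current value, so just retry.
    while (!target.compare_exchange_weak(old, old + value,
                                         std::memory_order_relaxed)) {}
}

// Option A: every thread adds into one shared buffer.
// No vector atomic is used, so each color component gets its own atomic add.
std::vector<float> splatAtomic(const std::vector<Splat>& splats,
                               std::size_t numPixels, unsigned numThreads) {
    assert(numThreads > 0);
    std::vector<std::atomic<float>> shared(numPixels * 3);
    for (auto& c : shared) c.store(0.0f);

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            for (std::size_t i = t; i < splats.size(); i += numThreads) {
                const Splat& s = splats[i];
                assert(s.pixel < numPixels);
                atomicAddFloat(shared[3 * s.pixel + 0], s.r);
                atomicAddFloat(shared[3 * s.pixel + 1], s.g);
                atomicAddFloat(shared[3 * s.pixel + 2], s.b);
            }
        });
    }
    for (auto& w : workers) w.join();

    std::vector<float> out(numPixels * 3);
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = shared[i].load();
    return out;
}

// Option B: every thread writes to its own private buffer (plain writes),
// followed by a reduction over all private buffers.
std::vector<float> splatReduce(const std::vector<Splat>& splats,
                               std::size_t numPixels, unsigned numThreads) {
    assert(numThreads > 0);
    std::vector<std::vector<float>> priv(
        numThreads, std::vector<float>(numPixels * 3, 0.0f));

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            std::vector<float>& mine = priv[t];
            for (std::size_t i = t; i < splats.size(); i += numThreads) {
                const Splat& s = splats[i];
                assert(s.pixel < numPixels);
                mine[3 * s.pixel + 0] += s.r;
                mine[3 * s.pixel + 1] += s.g;
                mine[3 * s.pixel + 2] += s.b;
            }
        });
    }
    for (auto& w : workers) w.join();

    // Reduction step: sum the private buffers into the final image.
    std::vector<float> out(numPixels * 3, 0.0f);
    for (const auto& buf : priv)
        for (std::size_t i = 0; i < out.size(); ++i) out[i] += buf[i];
    return out;
}
```

The trade-off is visible in the sketch: option A needs only one buffer but pays for three atomics per splat and stalls when many threads hit the same pixel, while option B uses plain writes but needs memory for one full buffer per thread plus the extra reduction pass. Which one wins depends on the contention, as said above.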

Check this post: https://forums.developer.nvidia.com/t/best-strategy-for-splatting-image-for-bidir/111000/2
