I have 2 questions:

  1. How do I use atomicAdd in my kernel? Do I need to include something? atomicAdd seems not to be defined.

  2. Do you think that using atomicAdd can speed up this code:

for (i = 0; i < N; i += M) {



transformed into

for (i = 0; i < N; atomicAdd(&i, M)) {





Atomics are only supported on compute 1.1 devices and later. You need to add “-arch sm_11” to the command line, as described in the documentation.
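As a minimal sketch (the kernel and variable names here are my own, not from the original post): atomicAdd is a built-in intrinsic, so nothing extra needs to be included; the kernel just has to be compiled for sm_11 or later.

```cuda
// Hypothetical example: every thread adds 1 to a counter in global memory.
// atomicAdd is a compiler intrinsic, so no extra header is required.
// Compile with: nvcc -arch sm_11 example.cu
__global__ void countThreads(int *counter)
{
    atomicAdd(counter, 1);   // read-modify-write with no lost updates
}
```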

To answer your question: no, atomicAdd is not likely to speed up that code. The “i” variable is stored in a per-thread register, so its performance is already optimal.

  1. You need to add a compiler option to enable compilation for sm_11 devices. I don’t remember the exact syntax: check the compiler help.

  2. Using atomicAdd in that situation will almost certainly slow your code down by a huge factor: with every thread contending for access to the variable i, execution will essentially be serialized.

I think you are misunderstanding what atomicAdd is for: ‘i’ looks like a local variable, and you can’t use atomicAdd on a local variable.

atomicAdd, like all atomic functions, is used to modify global memory without race conditions. It exists for “protection” (correctness), so don’t expect better performance than the non-atomic equivalents.
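To illustrate the intended use (a sketch with made-up names): atomicAdd is for when many threads must concurrently update a shared location in global memory, e.g. building a histogram.

```cuda
// Sketch of a legitimate atomicAdd use: a byte-value histogram.
// bins[] lives in global memory and is shared by all threads, so the
// concurrent increments must be atomic to avoid lost updates.
__global__ void histogram(const unsigned char *data, int n, unsigned int *bins)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        atomicAdd(&bins[data[idx]], 1u);
}
```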

Why does atomicAdd (or most of the other atomic functions) support ONLY integer types?

Because floating-point addition is not associative: (a+b)+c does not necessarily equal a+(b+c), since each intermediate result is rounded. This means that atomic floating-point operations, even though each one is individually atomic, could give a different final result depending on the order of operations resulting from the parallelism.

Of course you could use fixed point…