atomicAdd: looking for an explanation of the mysterious atomic functions

Hello,

My dev environment is:
VS2010
Nsight 1.5
CUDA SDK 3.2

Today was my first time working with the atomic functions.
I thought it was as easy as using every other function, but it wasn’t.
After reading some blogs/posts, I found out that I have to add -arch=sm_12 to the nvcc arguments in order to use the functions.

So now I’ve got some questions.
Why is this not working with a normal #include of device_functions.h or sm_11_atomic_functions.h?
Why is it not working with sm_11 even though the function is defined in the sm_11_atomic_functions.h file?

I would appreciate every single explanation ;)

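The issue is that atomic functions require the compiler to emit special PTX instructions that are only supported on certain CUDA devices. Including the header only makes the declarations visible; it does not change which instructions the compilation target supports. This is why you need to pass an extra flag such as -arch=sm_12 to the compiler to tell it that you are only going to run on devices of a certain compute capability (or greater).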
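
To make this concrete, here is a minimal sketch, not taken from your project. The file name atomic_demo.cu, the kernel countThreads and the helper runCountDemo are made up for illustration. Every thread adds 1 to a single counter in global memory, so the final value should equal the number of threads launched. With the CUDA 3.2 toolkit from the question this would be built with something like `nvcc -arch=sm_12 atomic_demo.cu -o atomic_demo`. Newer toolkits no longer accept the sm_1x targets, so there you would pass a current -arch value instead.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Every thread increments the same counter in global memory.
// atomicAdd makes the read-modify-write indivisible, so no increment is lost.
__global__ void countThreads(int *counter)
{
    atomicAdd(counter, 1);
}

// Hypothetical helper: launches the kernel and returns the final count,
// or -1 if any CUDA call fails (for example, no suitable device).
int runCountDemo(int blocks, int threadsPerBlock)
{
    int *dCounter = NULL;
    int hCounter = 0;

    if (cudaMalloc((void **)&dCounter, sizeof(int)) != cudaSuccess)
        return -1;

    // Start the device-side counter at zero.
    if (cudaMemcpy(dCounter, &hCounter, sizeof(int),
                   cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dCounter);
        return -1;
    }

    countThreads<<<blocks, threadsPerBlock>>>(dCounter);

    // Catch launch failures, e.g. a binary built for an unsupported target.
    if (cudaGetLastError() != cudaSuccess) {
        cudaFree(dCounter);
        return -1;
    }

    // This copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(&hCounter, dCounter, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dCounter);
    return (err == cudaSuccess) ? hCounter : -1;
}

int main()
{
    const int blocks = 4;
    const int threadsPerBlock = 64;

    int total = runCountDemo(blocks, threadsPerBlock);
    printf("counter = %d (expected %d)\n", total, blocks * threadsPerBlock);
    return (total == blocks * threadsPerBlock) ? 0 : 1;
}
```

If you replace atomicAdd(counter, 1) with a plain (*counter)++, the threads race on the counter and the result will usually come out lower than the thread count, which is exactly the problem the atomic functions are there to solve.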
