atomicAdd

I tried to use atomicAdd, but I got the message “Error: unsupported operation”.
Do you have any idea why?

In VS2009 I set “GPU Architecture” under Project->Properties->Configuration Properties->CUDA Runtime API->GPU to “compute_20,sm_20” (or higher), but it didn’t help.

What is your GPU?

Tesla C2050

The only explanation I can think of is that the architecture was not set correctly. On Linux we just add the flag -arch=sm_20; I am not sure how this translates to Visual Studio.

Assuming you meant VS2008, then you should have three options called ‘GPU Architecture(X)’ where X is 1, 2 or 3 in the options page you mentioned (CUDA Runtime API->GPU). The Visual Studio compiler compiles for each of those to allow for multiple versions, so if any one of those three has a value of sm_1X then you will get that error. If all of your options are sm_20 or above then something fishy is going on.
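To rule out the hardware side before digging through the project settings, a small device query can confirm what the card reports. This is only a sketch: `computeCapability` is a helper name made up for this post, and device index 0 is an assumption about which GPU is in use. A Tesla C2050 should report 2.0, which is the minimum for the float version of atomicAdd.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns major*10 + minor for the given device
// (20 would mean compute capability 2.0), or -1 if the query fails.
int computeCapability(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
        return -1;
    return prop.major * 10 + prop.minor;
}

int main()
{
    int cc = computeCapability(0);  // assumes the GPU of interest is device 0
    if (cc < 0) {
        printf("no usable CUDA device found\n");
        return 1;
    }
    printf("compute capability %d.%d\n", cc / 10, cc % 10);
    if (cc < 20)
        printf("float atomicAdd is not available on this device\n");
    return 0;
}
```

If the device reports 2.0 or higher, the hardware is fine and the problem has to be in the build settings or in the kernel code itself.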

If you aren’t sure whether it is set up correctly, you can go to the tab ‘CUDA Runtime API->Command Line’ and copy/paste the big block of text here for someone to check.

Usually I use only GPU Architecture1 with the value “sm_20”; the others have the value “0”.
Command Line:
echo “C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\bin\nvcc.exe” -gencode=arch=compute_20,code=“sm_20,compute_20” --machine 64 -ccbin “c:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin” -Xcompiler “/EHsc /W3 /nologo /O2 /Zi /MT " -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include” -maxrregcount=0 --compile -o “x64\Debug/MyCUDA.vcproj.obj” MyCUDA.vcproj

cd “x64\Debug”
findstr /L /I ““x64\Debug/MyCUDA.vcproj.obj”” “MyCUDA.device-link.options” >nul 2>&1
IF ERRORLEVEL 1 echo “x64\Debug/MyCUDA.vcproj.obj”>> “MyCUDA.device-link.options”
cd “c:\Projects\MyCUDA\MyCUDA”
“C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\bin\nvcc.exe” -gencode=arch=compute_20,code=“sm_20,compute_20” --machine 64 -ccbin “c:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin” -Xcompiler “/EHsc /W3 /nologo /O2 /Zi /MT " -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\include” -maxrregcount=0 --compile -o “x64\Debug/MyCUDA.vcproj.obj” “c:\Projects\MyCUDA\MyCUDA\MyCUDA.vcproj”

Well, if you are only compiling for one architecture and it is 2.0 or higher, then it is not an architecture problem. Out of curiosity, what are you trying to atomicAdd? If it is not a compile-settings problem, maybe you have accidentally hit some quirk or limitation of atomicAdd.

I’m trying to use atomicAdd on a float in a __global__ function:
float res=10;
atomicAdd(&res,5);

Does this example from the NVIDIA SDK work? http://www.naic.edu/~phil/hardware/nvidia/doc/ReleaseNotes.html#simpleAtomicIntrinsics

Are you trying to do an atomic add on a register? If so, a quick repro case would be

__global__ void foo()
{
    float res = 10;
    atomicAdd(&res,5);
}

If so, I am pretty sure that is invalid, and I don’t even know what it is meant to do. Atomics only really make sense on memory that is shared between threads, so global or shared memory. Their purpose is to remove the race condition where multiple threads update the same memory location at the same time, leaving you with an undetermined result. Since each thread has its own private copy of ‘res’ (a local variable lives in a register or in thread-local memory), every thread can update its own copy without affecting any other thread, and no atomic is needed. If you want all threads to accumulate into one value, then res needs to be declared as __shared__ (that way res would end up as 10 + (5 * numberOfThreads)) or live in global memory.
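For reference, here is a minimal sketch of the shared-memory variant described above. The kernel name `addFiveShared`, the host helper `runAddFiveShared`, and the single-block launch are assumptions made for illustration, not code from the original project. It needs to be built for compute capability 2.0 or higher (for example with -arch=sm_20 on a toolkit that still supports it), since float atomicAdd does not exist on sm_1X.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One block; every thread adds 5 to the same float in shared memory.
__global__ void addFiveShared(float *out)
{
    __shared__ float res;           // one copy per block, visible to all its threads
    if (threadIdx.x == 0)
        res = 10.0f;                // initialise once
    __syncthreads();                // make sure the initial value is visible

    atomicAdd(&res, 5.0f);          // float atomicAdd: compute capability 2.0+

    __syncthreads();                // wait until every thread has added
    if (threadIdx.x == 0)
        *out = res;                 // copy the total out to global memory
}

// Hypothetical host helper: launches one block of `threads` threads and
// returns the accumulated value, or -1 on a CUDA error.
float runAddFiveShared(int threads)
{
    float h_res = 0.0f;
    float *d_res = NULL;
    if (cudaMalloc((void **)&d_res, sizeof(float)) != cudaSuccess)
        return -1.0f;
    addFiveShared<<<1, threads>>>(d_res);
    cudaError_t err = cudaMemcpy(&h_res, d_res, sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_res);
    return err == cudaSuccess ? h_res : -1.0f;
}

int main()
{
    // With 64 threads the total should be 10 + 5 * 64.
    printf("res = %f\n", runAddFiveShared(64));
    return 0;
}
```

The same pattern works with a pointer to global memory passed in as a kernel argument; the only thing that does not work is taking the address of a per-thread local variable.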

Tiomat