Adding data from multiple threads

So if I have a float* array and I want multiple threads in a kernel to add to it, will they be able to do this just by calling something like array[i] = array[i] + newValue? I don’t care what order things are added in, as long as by the end of execution they all do get added.

Example:
const int ix = blockDim.x * blockIdx.x + threadIdx.x;
array[columnID] = array[columnID] + gpuA[ix];

columnID is defined separately, so just assume that part is working.

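No, that does not work. The read, the add and the write are separate steps, so two threads that hit the same element at the same time can overwrite each other's result. You will have to do some kind of reduction. I don’t think atomic float operations are supported yet on compute capability 1.3, but you can check in the programming guide. Otherwise you can use an integer atomic such as atomicAdd on values scaled to integers (atomicInc only counts up by one, so it cannot add an arbitrary value). But performance will not be as good as with a reduction.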
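For anyone reading this on newer hardware, here is a minimal sketch of the atomic route. It assumes a device of compute capability 2.0 or later, where atomicAdd has a float overload, so it does not apply to the 1.3 cards discussed above. The names columnOf, addToColumnsAtomic and addToColumnsOnDevice are made up for illustration, with columnOf standing in for however columnID is really computed. Error checking is left out to keep it short.

```cuda
#include <cuda_runtime.h>

// Each thread adds its element of gpuA into the slot for its column.
// columnOf is a hypothetical lookup table (assumption, not from the original post).
__global__ void addToColumnsAtomic(float *array, const float *gpuA,
                                   const int *columnOf, int n)
{
    const int ix = blockDim.x * blockIdx.x + threadIdx.x;
    if (ix < n) {
        // atomicAdd does the read-modify-write as one indivisible step,
        // so concurrent updates to the same element are not lost.
        // The float overload needs compute capability 2.0 or later.
        atomicAdd(&array[columnOf[ix]], gpuA[ix]);
    }
}

// Hypothetical host helper: copies the data over, runs the kernel,
// and copies the accumulated columns back into hostArray.
void addToColumnsOnDevice(float *hostArray, int numColumns,
                          const float *hostA, const int *hostColumnOf, int n)
{
    float *dArray, *dA;
    int *dColumnOf;
    cudaMalloc(&dArray, numColumns * sizeof(float));
    cudaMalloc(&dA, n * sizeof(float));
    cudaMalloc(&dColumnOf, n * sizeof(int));
    cudaMemcpy(dArray, hostArray, numColumns * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dA, hostA, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dColumnOf, hostColumnOf, n * sizeof(int), cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    addToColumnsAtomic<<<blocks, threads>>>(dArray, dA, dColumnOf, n);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(hostArray, dArray, numColumns * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dArray);
    cudaFree(dA);
    cudaFree(dColumnOf);
}
```

The order in which the additions land is not defined, which matches the "I don't care what order" requirement, but with floats it means the last bits of the result can differ from run to run. When many threads target the same column the atomics serialize, which is why a reduction is usually faster.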

Ok, I kinda thought this, but wanted to make sure before I spent time on it. I’ll probably just wait until later on in the program's development to do this, as I’m not sure if the overall impact on performance will be that big, but figured if it was as easy as doing this then I might as well.

If you know how the reduction example works and keep that in mind when programming, then adding things up at the end will not be performance critical, no :)
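To make the reduction idea concrete, here is a rough sketch of a plain tree reduction in shared memory. It is not the tuned code from any shipped example, just the basic pattern. It makes two simplifying assumptions: everything is summed into a single value (one column), and the block size is a power of two. The names blockSum and sumOnDevice are made up for illustration, and error checking is left out.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each block sums its slice of the input in shared memory and writes
// one partial sum. Assumes blockDim.x is a power of two.
__global__ void blockSum(const float *in, float *partial, int n)
{
    extern __shared__ float sdata[];
    const int tid = threadIdx.x;
    const int ix = blockDim.x * blockIdx.x + threadIdx.x;

    // Load one element per thread; pad with zero past the end.
    sdata[tid] = (ix < n) ? in[ix] : 0.0f;
    __syncthreads();

    // Halve the number of active threads each step; every active thread
    // adds the value from the upper half into its own slot.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) {
            sdata[tid] += sdata[tid + s];
        }
        __syncthreads();
    }

    // No two threads ever write the same location, so no atomics are needed.
    if (tid == 0) {
        partial[blockIdx.x] = sdata[0];
    }
}

// Hypothetical host helper: runs blockSum and adds up the per-block
// partial sums on the CPU. Assumes n > 0.
float sumOnDevice(const float *hostIn, int n)
{
    const int threads = 256;  // power of two, as the kernel requires
    const int blocks = (n + threads - 1) / threads;

    float *dIn, *dPartial;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dPartial, blocks * sizeof(float));
    cudaMemcpy(dIn, hostIn, n * sizeof(float), cudaMemcpyHostToDevice);

    // Third launch parameter is the dynamic shared memory size in bytes.
    blockSum<<<blocks, threads, threads * sizeof(float)>>>(dIn, dPartial, n);

    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), dPartial, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dPartial);

    float total = 0.0f;
    for (int b = 0; b < blocks; ++b) {
        total += partial[b];
    }
    return total;
}
```

This works on 1.x hardware as well, since it only uses shared memory and __syncthreads(). For per-column sums the same kernel could be launched once per column, or laid out so that each block handles one column, depending on how the data is arranged.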