Increment Global __device__ Issue

bob1029 · June 23, 2008, 4:13pm

I am trying to accumulate results from every thread in a device variable. When I run the program, it will only have results from the last thread to touch the variable.

For example:

__device__ float result = 0;

__global__ void gpuCode()
{
    result += threadIdx.x;
}

void host()
{
    float result_h;
    gpuCode<<<1, N>>>();
    cudaMemcpyFromSymbol((void*)&result_h, result, sizeof(float));
    printf("result: %f", result_h);
}

santyhyammer · June 23, 2008, 4:47pm

You can’t do that as written. To accumulate across threads you need either a reduction or atomic instructions, and the atomics only work on integers.

It works the same way as on a multithreaded CPU, where you would need a critical section or InterlockedIncrement to avoid race conditions and to synchronize the writes. Since CUDA does not have critical sections, use atomicAdd on hardware of compute capability 1.1 or above, or write a reduction kernel.
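To illustrate the atomic route, here is a minimal sketch of the original kernel with the accumulator switched to an int so that atomicAdd applies. The helper name sumThreadIndices and the reset via cudaMemcpyToSymbol are assumptions added for illustration, not part of the original post, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>

// Integer accumulator: integer atomicAdd on global memory is available
// from compute capability 1.1 onward.
__device__ int result = 0;

__global__ void gpuCode()
{
    // The atomic makes the read-modify-write indivisible, so no thread's
    // contribution is overwritten by another thread.
    atomicAdd(&result, (int)threadIdx.x);
}

// Hypothetical host helper (not from the original post): launches one
// block of n threads and returns the accumulated sum of thread indices.
int sumThreadIndices(int n)
{
    int zero = 0;
    cudaMemcpyToSymbol(result, &zero, sizeof(int));  // reset between calls

    gpuCode<<<1, n>>>();

    int result_h = 0;
    // This copy runs on the default stream, so it waits for the kernel.
    cudaMemcpyFromSymbol(&result_h, result, sizeof(int));
    return result_h;
}
```

With a CUDA device present, sumThreadIndices(256) should return 32640, the sum 0 + 1 + ... + 255. Every thread serializes on the same address, so this is simple but not fast for large thread counts.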

SPWorley · June 24, 2008, 12:50am

Also, atomic operations won’t work on floats.
Look at the “Reduction” example project in the SDK. It does exactly what you want.
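For readers who want the idea without opening the SDK, below is a rough sketch of a single-block shared-memory tree reduction that keeps the float accumulator from the original question. It is not the SDK sample's code, which is considerably more optimized. The BLOCK constant and the helper sumThreadIndicesFloat are assumptions made for illustration, and the sketch only handles one block whose size is a power of two.

```cuda
#include <cuda_runtime.h>

#define BLOCK 256  // assumed block size; must be a power of two

__device__ float result = 0.0f;

__global__ void gpuCode()
{
    __shared__ float partial[BLOCK];
    unsigned int tid = threadIdx.x;

    // Each thread stores its own contribution in shared memory.
    partial[tid] = (float)tid;
    __syncthreads();

    // Tree reduction: halve the number of active threads each step.
    for (unsigned int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();  // reached by all threads, outside the if
    }

    // Only one thread writes the global variable, so there is no race.
    if (tid == 0)
        result = partial[0];
}

// Hypothetical host helper (not from the original post).
float sumThreadIndicesFloat()
{
    gpuCode<<<1, BLOCK>>>();

    float result_h = 0.0f;
    // This copy runs on the default stream, so it waits for the kernel.
    cudaMemcpyFromSymbol(&result_h, result, sizeof(float));
    return result_h;
}
```

With a CUDA device present, sumThreadIndicesFloat() should return 32640.0f. Extending this to several blocks needs a second pass over the per-block partial sums. On newer hardware (compute capability 2.0 and above) atomicAdd also accepts float, which is another way to combine the per-block results.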
