atomicAdd() does not work well during the loop but works well at the end

majid_dor · May 20, 2010, 10:25am

Hi,

I have a problem with the atomicAdd() function. My kernel looks like this:

int value = 0;
int i = blockIdx.x * blockDim.x + threadIdx.x;

if (i < N * M) {
    value = atomicAdd((int *)&old[0], 1);
    C[i] = value;
}

I expect the result to be:

C[0] = 0, C[1] = 1, … C[i] = i, … C[N*M-1] = N*M-1.

But only some elements of C hold these values, such as C[N*M-1], C[N*M-2], … C[N*M-k]. For C[0], C[1], … C[N*M-k-1], the stored value is not the expected one. How can I fix my kernel so that it works correctly?

Preetha · May 20, 2010, 10:45am

The execution order of CUDA threads is not defined.

You are adding 1 to a shared variable with an atomic function and expecting the threads to run in the order 0, 1, 2, … N*M-1. That is not the case: the threads run in some other order, so the value atomicAdd() returns to thread i is generally not i. This is why you are getting unexpected results.

Maybe if you try this in device emulation mode you will get the expected result, since the threads should execute in 0, 1, 2, … N*M-1 order there.
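To make the point above concrete, here is a minimal sketch (not the original poster's full code). It assumes the counter is a plain int in global memory that is zeroed before launch. The names takeTickets, writeIndex, runTakeTickets and isPermutationOfRange are made up for illustration, and error checking is left out for brevity. Each thread gets a unique value from atomicAdd(), but which thread gets which value depends on scheduling, so the only thing you can rely on is that C ends up holding every value from 0 to n-1 exactly once, in some order. If you really want C[i] == i, write the index directly and skip the atomic.

```cuda
#include <vector>
#include <algorithm>
#include <cuda_runtime.h>

// Each thread takes a "ticket" from a shared counter. atomicAdd() returns the
// value the counter held just before this thread's increment.
__global__ void takeTickets(int *counter, int *C, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        C[i] = atomicAdd(counter, 1);  // unique per thread, order unspecified
    }
}

// If the goal is simply C[i] == i, no atomic is needed at all.
__global__ void writeIndex(int *C, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        C[i] = i;
    }
}

// Hypothetical host helper: runs takeTickets on n threads and returns C.
// Error checking is omitted to keep the sketch short.
std::vector<int> runTakeTickets(int n)
{
    int *dCounter = nullptr;
    int *dC = nullptr;
    cudaMalloc(&dCounter, sizeof(int));
    cudaMalloc(&dC, n * sizeof(int));
    cudaMemset(dCounter, 0, sizeof(int));  // counter must start at 0

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    takeTickets<<<blocks, threads>>>(dCounter, dC, n);

    std::vector<int> h(n);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(h.data(), dC, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dCounter);
    cudaFree(dC);
    return h;
}

// Hypothetical check: true if v holds each of 0..v.size()-1 exactly once.
bool isPermutationOfRange(std::vector<int> v)
{
    std::sort(v.begin(), v.end());
    for (size_t k = 0; k < v.size(); ++k) {
        if (v[k] != (int)k) return false;
    }
    return true;
}
```

Calling isPermutationOfRange(runTakeTickets(N * M)) from main() should return true on every run, while the position of each individual value inside C may change from run to run.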

avidday · May 20, 2010, 10:56am

You might be interested in this thread, which discussed basically the same thing.

majid_dor · May 20, 2010, 7:46pm

thanks!
