AtomicAdd() functions

rong-1234 · December 8, 2016, 9:10am

Hello guys, I am currently using atomicAdd() to add values to an array. It works fine when the number of threads is small (around 800), but it does not work correctly when I am dealing with 2.4 million threads. Here is part of my code:

#define TILEWIDTH 32

for (i = 5; i >= 0; i--)
{
    xy = (xy << 1) + (p[threadIdx.y * 32 * 4 + threadIdx.x * 4 + i].t >= (uint)d_p[(blockIdx.z + 3)*(d_roi_height)*(d_roi_width) + (threadIdx.y+1+(blockIdx.y*TILEWIDTH))*(d_roi_width) + (threadIdx.x+1)+(blockIdx.x*TILEWIDTH)]);

    printf("Block id x: %d\tBlock id y: %d\tBlock id z: %d\tthreadid.x: %d\tthreadid.y: %d\txy:%d\n", blockIdx.x, blockIdx.y, blockIdx.z, threadIdx.x, threadIdx.y, xy);
}
atomicAdd(&d_basic_xy[xy], 1.0f);
}

The printf call is used to check the value of xy. According to the printf output, xy=0 appears 100 times, but atomicAdd only adds 8 to that element (it should be 100).
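One possible reading of the posted snippet, offered as a guess rather than a confirmed diagnosis: the printf sits inside the loop and so reports every intermediate value of xy, while atomicAdd runs once per thread after the loop, on the final value only. The sketch below illustrates that difference under stated assumptions. The kernel name countZeros, the bits input array, and the two counters are hypothetical stand-ins; the original p and d_p comparison is replaced by reading one bit of a test pattern.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the posted loop: each thread builds a 6-bit code.
// seenInLoop counts xy == 0 at the place where the printf sits (inside the loop).
// finalZero counts xy == 0 at the place where atomicAdd sits (after the loop).
__global__ void countZeros(const unsigned int* bits, int n,
                           unsigned int* seenInLoop, unsigned int* finalZero)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    unsigned int xy = 0;
    for (int i = 5; i >= 0; i--) {
        // Assumed replacement for the original comparison: take bit i of the input.
        xy = (xy << 1) + ((bits[tid] >> i) & 1u);
        if (xy == 0) atomicAdd(seenInLoop, 1u);  // intermediate value, as printed
    }
    if (xy == 0) atomicAdd(finalZero, 1u);       // final value, as accumulated
}

int main()
{
    const int n = 64;  // one thread per possible 6-bit pattern
    unsigned int *bits, *seenInLoop, *finalZero;
    cudaMallocManaged(&bits, n * sizeof(unsigned int));
    cudaMallocManaged(&seenInLoop, sizeof(unsigned int));
    cudaMallocManaged(&finalZero, sizeof(unsigned int));

    for (int t = 0; t < n; ++t) bits[t] = t;
    *seenInLoop = 0;
    *finalZero = 0;

    countZeros<<<1, n>>>(bits, n, seenInLoop, finalZero);
    cudaDeviceSynchronize();

    printf("xy == 0 seen inside the loop: %u\n", *seenInLoop);
    printf("xy == 0 after the loop:       %u\n", *finalZero);

    cudaFree(bits);
    cudaFree(seenInLoop);
    cudaFree(finalZero);
    return 0;
}
```

With these 64 test patterns, the in-loop counter reaches 63 (32 + 16 + 8 + 4 + 2 + 1) while the after-loop counter is 1, so a printf tally taken inside the loop can be far larger than the number of increments atomicAdd actually performs. Moving the printf after the loop would make the two counts comparable.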

Does anyone know how to solve this? Is it because the number of threads is so large that the kernel is terminated by a timeout before it has finished all its work?
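Whether a kernel was terminated early can be checked directly instead of inferred from the results. The sketch below is a minimal illustration, not the poster's code: the kernel addOne, the bin count, and the launch configuration are assumptions chosen to give roughly 2.4 million threads. It inspects the status returned by cudaGetLastError() and cudaDeviceSynchronize(); a kernel killed by the display watchdog is reported as cudaErrorLaunchTimeout.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread adds 1.0f to one of nbins float bins.
__global__ void addOne(float* bins, int nbins)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    atomicAdd(&bins[tid % nbins], 1.0f);
}

int main()
{
    const int nbins = 64;
    const int blocks = 2400, threads = 1024;  // 2,457,600 threads in total

    float* bins;
    cudaMallocManaged(&bins, nbins * sizeof(float));
    for (int b = 0; b < nbins; ++b) bins[b] = 0.0f;

    addOne<<<blocks, threads>>>(bins, nbins);

    // Errors in the launch itself (bad configuration, etc.) show up here.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Errors during execution, including a watchdog timeout, show up here.
    err = cudaDeviceSynchronize();
    if (err == cudaErrorLaunchTimeout) {
        printf("kernel was terminated by the watchdog timer\n");
        return 1;
    }
    if (err != cudaSuccess) {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // If no error was reported, every thread ran and every increment is counted.
    double total = 0.0;
    for (int b = 0; b < nbins; ++b) total += bins[b];
    printf("total = %.0f, expected = %d\n", total, blocks * threads);

    cudaFree(bins);
    return 0;
}
```

If both calls return cudaSuccess, the kernel ran to completion, and a wrong count would then have to come from the indexing or the counting logic rather than from a timeout.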

episteme · December 9, 2016, 4:20am

I can't reproduce the problem.

#include <cuda_runtime.h>
#include <iostream>

__global__ void myKernel(unsigned int* sum) {
  atomicAdd(sum, 1U);
}

int main() {

  unsigned int* sum;
  cudaMallocManaged(&sum, sizeof(unsigned int));
  *sum = 0U;

  myKernel<<<dim3(32,32,10),dim3(32,32)>>>(sum);
  cudaDeviceSynchronize();
  std::cout << *sum << std::endl;

  cudaFree(sum);
  cudaDeviceReset();
}

It works. It shows me the exact result: 10485760 (= 32*32*10 * 32*32), which equals the number of threads.
