Atomic functions problem

shiftreduce · May 28, 2009, 4:21am

Hello all!

I’m having a weird problem while trying to use the atomicAdd() function…This is my simple kernel:

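__global__ void k1(int *out, unsigned int *index) {
    // Only threads 4 and up pass the filter
    if (threadIdx.x > 3) {
        // atomicAdd returns the old counter value, used here as the output slot
        int resultIndex = atomicAdd(index, 1);
        out[resultIndex] = 5;
    }
}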

This is just a test: I want to implement a filter that writes its results to an output buffer. The thing is that after calling this kernel from the host and copying the output buffer back into host memory, the result I print is always 0. But if I take out the atomicAdd() line (and write to the threadIdx.x position in the array instead), it does print 5…

What’s going on here?

SPWorley · May 28, 2009, 6:00am

It may be something simple like the *index pointer being a host pointer by accident, or accidentally using a block size of 1 wide and 64 high instead of the converse.

Can you post the 5-10 lines of code you use to allocate and copy your memory, call the kernel, then copy and print the results?

jph4599 · May 28, 2009, 1:06pm

Does your device support atomics?

Cygnus_X1 · May 28, 2009, 1:10pm

Is the value at *index reset to 0 at the beginning? :)

shiftreduce · May 28, 2009, 1:52pm

This is the host code where I allocate the memory:

unsigned int *index;

cudaMalloc((void**) &index, sizeof(*index));

cudaMemcpy(index, 0, sizeof(int), cudaMemcpyHostToDevice);

How can I check my device compute capability?

Cygnus_X1 · May 28, 2009, 2:07pm

cudaMemcpy(index, 0, sizeof(int), cudaMemcpyHostToDevice);

This copies an int from host address 0 into index. At address 0 there is probably just garbage.

I’m surprised this didn’t raise a segmentation fault, but maybe cudaMemcpy operates at some higher privilege level that bypasses the normal OS protections.

Try this:

unsigned int cpuValue=0;

cudaMemcpy(index, &cpuValue, sizeof(int), cudaMemcpyHostToDevice);

Regarding your Compute Capability - what graphics card do you have?
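Putting the fix together with the kernel from the first post, the whole round trip might look like the sketch below. The `CUDA_CHECK` macro and the `runFilter` function are hypothetical names introduced here for illustration, not anything from the thread or the CUDA SDK, and the single-block launch is an assumption. Checking every return code is one way to catch a bad pointer early instead of silently reading zeros back.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not from the thread): abort on any CUDA runtime error.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,      \
                    cudaGetErrorString(err_));                      \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Kernel from the first post: threads with threadIdx.x > 3 each claim a slot.
__global__ void k1(int *out, unsigned int *index) {
    if (threadIdx.x > 3) {
        unsigned int resultIndex = atomicAdd(index, 1u);
        out[resultIndex] = 5;
    }
}

// Hypothetical wrapper: launches one block of `threads` threads, copies the
// first `capacity` output ints into hostOut, and returns the final counter.
unsigned int runFilter(int threads, int *hostOut, int capacity) {
    // Every thread above index 3 writes one slot, so the buffer must hold them all.
    assert(capacity >= threads - 4);

    int *out = nullptr;
    unsigned int *index = nullptr;
    unsigned int count = 0;  // a real host variable, so &count is a valid source

    CUDA_CHECK(cudaMalloc((void **)&out, capacity * sizeof(int)));
    CUDA_CHECK(cudaMalloc((void **)&index, sizeof(*index)));
    CUDA_CHECK(cudaMemset(out, 0, capacity * sizeof(int)));
    CUDA_CHECK(cudaMemcpy(index, &count, sizeof(count), cudaMemcpyHostToDevice));

    k1<<<1, threads>>>(out, index);
    CUDA_CHECK(cudaGetLastError());  // reports launch failures

    CUDA_CHECK(cudaMemcpy(hostOut, out, capacity * sizeof(int),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaMemcpy(&count, index, sizeof(count), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(out));
    CUDA_CHECK(cudaFree(index));
    return count;
}
```

Called from `main` as `runFilter(8, hostOut, 8)` with an 8-element host array, threads 4 to 7 pass the filter, so the function should return 4 and the first four entries of `hostOut` should be 5. The order in which threads claim slots is not defined, which does not matter here because every thread writes the same value.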

YDD · May 28, 2009, 2:26pm

Use the deviceQuery program in the SDK. It prints out the compute capability of each device (among other useful information).
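If you would rather check from your own program, the runtime API exposes the same information through `cudaGetDeviceProperties`. The sketch below is only an illustration: `supportsGlobalAtomics` is a made-up name, and the 1.1 threshold is the compute capability at which 32-bit integer atomics on global memory became available.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints the compute capability of `device` and returns
// true if it is at least 1.1. Returns false if the query itself fails.
bool supportsGlobalAtomics(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        return false;
    }
    printf("Device %d: %s, compute capability %d.%d\n",
           device, prop.name, prop.major, prop.minor);
    return prop.major > 1 || (prop.major == 1 && prop.minor >= 1);
}
```

Calling `supportsGlobalAtomics(0)` checks the first device. Note that the code also has to be compiled for a matching architecture (for example with nvcc's `-arch` option), otherwise atomicAdd() will not be available to the kernel.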

shiftreduce · May 30, 2009, 2:36am

Thanks! This was the problem…I thought there was no need to allocate memory when copying a static value…

Thanks all, problem solved!

Cygnus_X1 · May 30, 2009, 8:30am

This should also work:

cudaMemset (index, 0, sizeof(int));

Yet I seldom use it: it is good for clearing data, but not for setting more complex initial values.
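The reason cudaMemset is limited to clearing is that it fills memory byte by byte, like the C memset. A small sketch to see this for yourself (`memsetWord` is a hypothetical helper name, not part of CUDA):

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: fills one device unsigned int with `byteValue` using
// cudaMemset, copies it back into *result, and returns true on success.
bool memsetWord(int byteValue, unsigned int *result) {
    unsigned int *d = nullptr;
    if (cudaMalloc((void **)&d, sizeof(*d)) != cudaSuccess) {
        return false;
    }
    // Each of the 4 bytes of *d is set to the low byte of byteValue.
    bool ok = cudaMemset(d, byteValue, sizeof(*d)) == cudaSuccess &&
              cudaMemcpy(result, d, sizeof(*d), cudaMemcpyDeviceToHost) ==
                  cudaSuccess;
    cudaFree(d);
    return ok;
}
```

With a byte value of 0 the word comes back as 0, which is exactly what the counter needs. With a byte value of 1 it comes back as 0x01010101 (16843009), not 1, so any non-zero starting value is better copied from a host variable with cudaMemcpy.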
