Missing Kernel executions

OnkelDagobert · June 27, 2012, 9:27am

Hi,

I’m a CUDA beginner and I have some strange issues with my code…

Kernel:

__global__ void _testkernel(unsigned int* debug){
	unsigned int idx = threadIdx.x;
	++debug[idx];
}

Host:

{
	unsigned int block_size = 512;
	unsigned int* d_debug;
	CUDA_SAFE_CALL(cudaMalloc((void**)&d_debug, block_size*sizeof(unsigned int)));
	CUDA_SAFE_CALL(cudaMemset(d_debug, 0, block_size*sizeof(unsigned int)));

	dim3 dimBlock; dimBlock.x = block_size;
	dim3 dimGrid; dimGrid.x = 512;
	_testkernel<<<dimGrid, dimBlock>>>(d_debug);
	CUDA_SAFE_THREAD_SYNC();

	// check for error
	cudaError_t error1 = cudaGetLastError();
	if(error1 != cudaSuccess)
	{
		// print the CUDA error message
		printf("CUDA error: %s\n", cudaGetErrorString(error1));
	}

	unsigned int* h_debug;
	CUDA_SAFE_CALL(cudaMallocHost(&h_debug, block_size*sizeof(unsigned int)));
	CUDA_SAFE_CALL(cudaMemcpy(h_debug, d_debug, block_size*sizeof(unsigned int), cudaMemcpyDeviceToHost));
	for(unsigned int bt=0; bt<block_size; ++bt){
		std::cerr << h_debug[bt] << ";";
	}
}

This is what I expected:

512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;[...]

512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;512;

And this is what I get:

20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;

21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;21;

19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;

18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;

17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;

15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;15;

18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;18;

16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;

16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;

14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;14;

16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;16;

17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;

17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;17;

19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;

19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;19;

20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;20;

I have no idea what I’m doing wrong. Can someone help me, please?

OS: openSUSE Tumbleweed 64bit

GeForce GTX 570

Driver Version: 295.59

CUDA Toolkit 4.2.9

tera · June 27, 2012, 9:53am

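You need to use atomic operations (e.g. atomicAdd) for the increment, because multiple blocks execute in parallel and thread idx of every block updates the same debug[idx]. A plain ++ is a non-atomic read-modify-write, so concurrent increments overwrite each other and get lost.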
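
A minimal sketch of that fix, assuming a toolkit recent enough to provide cudaDeviceSynchronize and a device that supports atomicAdd on unsigned int in global memory (a GTX 570 does). The names testkernel_atomic and run_counters are hypothetical and not from the original post, and plain runtime-API error checks stand in for the CUDA_SAFE_CALL macros.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Thread idx of every block targets the same slot debug[idx].
// atomicAdd makes the read-modify-write indivisible, so no increment is lost.
__global__ void testkernel_atomic(unsigned int* debug) {
    unsigned int idx = threadIdx.x;
    atomicAdd(&debug[idx], 1u);
}

// Hypothetical helper: launches grid_size blocks of block_size threads and
// returns the counters copied back to the host (empty vector on any error).
std::vector<unsigned int> run_counters(unsigned int grid_size, unsigned int block_size) {
    const size_t bytes = block_size * sizeof(unsigned int);
    unsigned int* d_debug = 0;
    if (cudaMalloc((void**)&d_debug, bytes) != cudaSuccess) {
        return std::vector<unsigned int>();
    }

    cudaError_t err = cudaMemset(d_debug, 0, bytes);
    if (err == cudaSuccess) {
        testkernel_atomic<<<grid_size, block_size>>>(d_debug);
        err = cudaGetLastError();          // reports launch configuration errors
    }
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();     // reports errors raised during execution
    }

    std::vector<unsigned int> h_debug(block_size, 0u);
    if (err == cudaSuccess) {
        err = cudaMemcpy(&h_debug[0], d_debug, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_debug);

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return std::vector<unsigned int>();
    }
    return h_debug;
}

int main() {
    // Same launch shape as in the question: 512 blocks of 512 threads.
    std::vector<unsigned int> counters = run_counters(512, 512);
    for (size_t i = 0; i < counters.size(); ++i) {
        printf("%u;", counters[i]);
    }
    printf("\n");
    return counters.empty() ? 1 : 0;
}
```

With atomic increments each counter ends up equal to the number of blocks, since every block contributes exactly one increment per slot. If the goal were one independent counter per thread instead, indexing with blockIdx.x * blockDim.x + threadIdx.x into a larger buffer would avoid the contention altogether.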

OnkelDagobert · June 27, 2012, 10:13am

Thanks, it worked.
