First CUDA program... looks good, executing wrong?

omdown · June 24, 2009, 2:49pm

Alright, I don’t even know what to make of this anymore, I’ve tried simplifying this down as much as I can and I still can’t figure this out. Simple addition of two arrays. The first element returned is the correct sum of the two values. The second through n-th elements are random values, not sure where they’re coming from. I’m sure this is one of those “duh” moments but forgive me, I’m completely new to CUDA and parallel programming in general. Does anything jump out at anyone about this?

/*****************************************
			MATRIX BUILDER
*****************************************/

void buildMatrix(float* matrixElements){
	for(uint i = 0; i < 100; i++){
		// generate a random value n { 0 <= n <= 9 }
		matrixElements[ i ] = rand() % 10;
	}
}

/*****************************************
			CUDA MATRIX ADDER
*****************************************/

__global__ void addKernel(float* A, float* B, float* C) {
	C[threadIdx.x] = A[threadIdx.x] + B[threadIdx.x];
}

/******************************************
			MAIN PROGRAM
******************************************/

int main()
{
	srand( time (NULL) );

	// CREATE A
	float elementsA[100];
	float elementsB[100];
	float elementsC[100];

	// FILL WITH RANDOM ELEMENTS
	buildMatrix(elementsA);
	buildMatrix(elementsB);

	// ALLOCATE THE ELEMENTS TO THE DEVICE
	float* deviceElA;
	float* deviceElB;
	float* deviceElC;

	cudaMalloc((void**) &deviceElA, sizeof(elementsA));
	cudaMalloc((void**) &deviceElB, sizeof(elementsB));
	cudaMalloc((void**) &deviceElC, sizeof(elementsC) * sizeof(float));

	cudaMemcpy(deviceElA, elementsA, sizeof(elementsA), cudaMemcpyHostToDevice);
	cudaMemcpy(deviceElB, elementsB, sizeof(elementsB), cudaMemcpyHostToDevice);

	// DISPATCH TO THE KERNEL
	addKernel <<<1, 100>>>(deviceElA, deviceElB, deviceElC);

	// COPY THE VALUES BACK
	cudaMemcpy(elementsC, deviceElC, sizeof(deviceElC), cudaMemcpyDeviceToHost);

	// ITERATE THROUGH THE RESULTS
	for(int i = 0; i < 100; i++){
		cout<<i<<":  "<<elementsA[i]<<" + "<<elementsB[i]<<" = "<<elementsC[i]<<endl<<endl;
		cout.flush();
	}

	cudaFree(deviceElA);
	cudaFree(deviceElB);
	cudaFree(deviceElC);

	return 0;
}

The whole thing compiles without complaint, and executes without error, except for returning garbage values, example below:

0:  2 + 5 = 7
1:  1 + 6 = 3.27853e-39
2:  1 + 2 = -1.55064
3:  7 + 8 = 3.47529e-39
4:  7 + 2 = 7.00649e-45
5:  8 + 4 = -2.91408e-05
6:  2 + 3 = 0
7:  9 + 0 = 0
8:  9 + 4 = 1.4013e-45
9:  4 + 7 = -2.91491e-05

.... (cut out the rest for sake of it all being the same)

Many thanks to anyone who can slap me across the back of the head and point out what I’m doing wrong… because I’m pretty confused and frustrated…

Nico · June 24, 2009, 3:04pm

cudaMemcpy(elementsC, deviceElC, sizeof(deviceElC), cudaMemcpyDeviceToHost);

Here, you’re calculating the size of a pointer to memory, and not the size of the array it points to.

It’s better to calculate the size of an array as numberOfElements*sizeof(element)

N.

sigismondo · June 24, 2009, 3:06pm

got it!

sizeof(deviceElC) = 4 (on a 32-bit build) - you are copying just 4 bytes

deviceElC is a pointer

omdown · June 24, 2009, 3:09pm

Oh ffs… I knew this was going to be pointer related… I’m so abhorrently bad with pointer management it’s not funny. -_-;

Thanks, sorry for the dumb question. lol
