Very beginner programming issue

Hello,

I’m very new to CUDA (and somewhat new to C/C++ as well). At my institution, our CUDA computer is stuck running CUDA 4.2 on Ubuntu 12. We’re in the middle of building a new system, but it won’t be ready for at least 6 more months. In the meantime, I’m trying to get a VERY basic program to work:

My file is dumb2.cu:

#include <cuda.h>
#include <stdio.h>

__global__ void AddIntsCUDA(int *a, int *b)
{
	a[0] += b[0];
}

int main()
{
	int a = 5, b = 9;
	int *d_a, *d_b;

	cudaMalloc(&d_a, sizeof(int));
	cudaMalloc(&d_b, sizeof(int));

	cudaMemcpy(d_a, &a, sizeof(int), cudaMemcpyHostToDevice);
	cudaMemcpy(d_b, &b, sizeof(int), cudaMemcpyHostToDevice);

	AddIntsCUDA<<<1,1>>>(d_a,d_b);

	cudaMemcpy(&a, d_a, sizeof(int), cudaMemcpyDeviceToHost);

	printf("5 + 9 = %i \n", a);

	cudaFree(d_a);
	cudaFree(d_b);

	return 0;
}

and in the command line (Ubuntu 12), I type:

nvcc -o dumb2 dumb2.cu
./dumb2

and I receive

5 + 9 = 5

which is clearly not right. It seems to me that the variable ‘a’ isn’t updating, but I’m not sure what I’m doing wrong. I appreciate any help I can get. Thanks.

Add proper CUDA error checking to your code. Not sure what that is? Google “proper CUDA error checking”.

Your code looks OK to me. It may be that you have some issue with your system, and the error checking may give some clues about that.

Add error checking. If you do not know how to do that, do an internet search for “proper CUDA error checking”.
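
For reference, one common shape for this is to wrap every runtime call in a macro and to check again right after the kernel launch. The following is only a minimal sketch applied to the dumb2.cu program from the question, not a definitive implementation. The names cudaOk and CUDA_CHECK are hypothetical helpers made up for this example; everything else is the standard CUDA runtime API.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): reports a failed call and
// returns false; returns true when the call succeeded.
static bool cudaOk(cudaError_t err, const char *what, const char *file, int line)
{
	if (err == cudaSuccess)
		return true;
	fprintf(stderr, "CUDA error at %s:%d (%s): %s\n",
	        file, line, what, cudaGetErrorString(err));
	return false;
}

// Hypothetical macro: wraps a runtime call and exits on the first failure.
#define CUDA_CHECK(call) \
	do { \
		if (!cudaOk((call), #call, __FILE__, __LINE__)) \
			exit(EXIT_FAILURE); \
	} while (0)

__global__ void AddIntsCUDA(int *a, int *b)
{
	a[0] += b[0];
}

int main()
{
	int a = 5, b = 9;
	int *d_a, *d_b;

	CUDA_CHECK(cudaMalloc(&d_a, sizeof(int)));
	CUDA_CHECK(cudaMalloc(&d_b, sizeof(int)));

	CUDA_CHECK(cudaMemcpy(d_a, &a, sizeof(int), cudaMemcpyHostToDevice));
	CUDA_CHECK(cudaMemcpy(d_b, &b, sizeof(int), cudaMemcpyHostToDevice));

	AddIntsCUDA<<<1,1>>>(d_a, d_b);
	CUDA_CHECK(cudaGetLastError());       // errors from the launch itself
	CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

	CUDA_CHECK(cudaMemcpy(&a, d_a, sizeof(int), cudaMemcpyDeviceToHost));

	printf("5 + 9 = %i\n", a);

	CUDA_CHECK(cudaFree(d_a));
	CUDA_CHECK(cudaFree(d_b));

	return 0;
}
```

With every call wrapped, the program stops at the first call that fails and names it, instead of printing a stale value of a.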

Thanks very much!

I implemented it in a brute-force way:

#include <cuda.h>
#include <stdio.h>
#include <stdlib.h>	// for exit()

__global__ void AddIntsCUDA(int *a, int *b)
{
	a[0] += b[0];
}

void errorCheck()
{
  cudaError_t error = cudaGetLastError();
  if(error != cudaSuccess)
  {
    // print the CUDA error message and exit
    printf("CUDA error: %s\n", cudaGetErrorString(error));
    exit(-1);
  }
}

int main()
{
	int a = 5, b = 9;
	int *d_a, *d_b;

	cudaMalloc(&d_a, sizeof(int));

	errorCheck();

	cudaMalloc(&d_b, sizeof(int));

	cudaMemcpy(d_a, &a, sizeof(int), cudaMemcpyHostToDevice);
	cudaMemcpy(d_b, &b, sizeof(int), cudaMemcpyHostToDevice);

	AddIntsCUDA<<<1,1>>>(d_a,d_b);

	cudaMemcpy(&a, d_a, sizeof(int), cudaMemcpyDeviceToHost);

	printf("5 + 9 = %i \n", a);

	cudaFree(d_a);
	cudaFree(d_b);

	return 0;
}

and I was given

CUDA error: CUDA driver version is insufficient for CUDA runtime version

The previous operator of this computer apparently wrote a number of custom drivers and left almost no documentation. When I run his legacy code, it works, but this code doesn’t. I can’t modify the drivers because then the legacy applications will no longer work…

Anyway, thank you for letting me know about CUDA error checking.
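
For anyone who lands here with the same message: it may help to print the two versions the error is comparing. Below is a minimal sketch, assuming cudaDriverGetVersion and cudaRuntimeGetVersion can be called on the affected machine even though other runtime calls fail. The helper names versionMajor and versionMinor are hypothetical and exist only for this example.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helpers: CUDA reports versions as 1000*major + 10*minor,
// so CUDA 4.2 is reported as 4020.
static int versionMajor(int v) { return v / 1000; }
static int versionMinor(int v) { return (v % 1000) / 10; }

int main()
{
	int driverVersion = 0, runtimeVersion = 0;
	cudaError_t err;

	// Highest CUDA version the installed driver supports.
	err = cudaDriverGetVersion(&driverVersion);
	if (err != cudaSuccess) {
		printf("cudaDriverGetVersion failed: %s\n", cudaGetErrorString(err));
		return 1;
	}

	// Version of the runtime library this program was built against.
	err = cudaRuntimeGetVersion(&runtimeVersion);
	if (err != cudaSuccess) {
		printf("cudaRuntimeGetVersion failed: %s\n", cudaGetErrorString(err));
		return 1;
	}

	printf("Driver supports CUDA %d.%d\n",
	       versionMajor(driverVersion), versionMinor(driverVersion));
	printf("Runtime is CUDA %d.%d\n",
	       versionMajor(runtimeVersion), versionMinor(runtimeVersion));

	if (driverVersion < runtimeVersion)
		printf("The driver is older than the runtime.\n");

	return 0;
}
```

If the driver reports an older version than the runtime, the nvcc found on the PATH probably belongs to a newer toolkit than the driver supports. Comparing this output with nvcc --version is one way to check that without touching the drivers.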