Hello everyone,
I’m just getting started with NVIDIA’s CUDA platform and have begun running some tests. I’m testing a vector sum with the following code (showing only what I think are the relevant parts):
__global__ void vecAdd(float * A, float * B, float * C) {
int i = threadIdx.x;
C[i] = A[i]+B[i];
}
int main(int argc, char ** argv) {
cudaError_t error;
//Declare local stuff
float A[] = {2.0f, 3.0f, 4.0f, 5.0f, 6.0f};
float B[] = {2.0f, 4.0f, 6.0f, 8.0f, 10.0f};
float C[5];
//Move stuff to device
float * Ad, * Bd, * Cd;
cudaMalloc((void **)&Ad,5*sizeof(float));
cudaMemcpy(Ad, A, 5*sizeof(float), cudaMemcpyHostToDevice);
cudaMalloc((void **)&Bd,5*sizeof(float));
cudaMemcpy(Bd, B, 5*sizeof(float), cudaMemcpyHostToDevice);
cudaMalloc((void **)&Cd,5*sizeof(float));
//Do stuff
vecAdd<<<1,4>>>(Ad, Bd, Cd);
//Check for errors
error = cudaGetLastError();
if (error != cudaSuccess) {
printf("Error: %s\n", cudaGetErrorString(error));
}
//Retrieve stuff
cudaMemcpy(C, Cd, 5*sizeof(float), cudaMemcpyDeviceToHost);
printf("C: {%.2f,%.2f,%.2f, %.2f, %.2f}\n",C[0], C[1], C[2], C[3], C[4]);
cudaFree(Ad);
cudaFree(Bd);
cudaFree(Cd);
return 0;
}
The code compiles fine and runs without errors, but the behavior is strange: if I modify the source to decrease the number of threads, I still get the same results in vector C. After trying thread counts both above and below 5 with no change in the output, I even changed the contents of array B, and the output still did not change. That makes me think a previously computed result is still sitting in device memory and that I need to clear it to get the expected behavior, but I can’t find anything in the documentation to support this theory. Am I missing something?
Some other interesting experiments: I rebooted the computer and compiled the code with only one thread, and it gave me a reasonable result, with nan in all values but the first; then I changed it to two threads, which gave nan in all values but the first two; then I tried 3, and everything looked normal; but when I started decreasing the number again, the output would not change.
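To test the stale-memory theory, here is a minimal sketch of a variant that could be run. It relies on two things I am fairly sure of: `cudaMalloc` does not clear the memory it returns, and a blocking `cudaMemcpy` can report an error left over from an earlier kernel launch. The `CHECK` macro, the extra `n` parameter, and the `cudaMemset` call are additions for illustration only and are not part of the code above.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not in the original code): abort on any CUDA runtime error.
#define CHECK(call)                                                 \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,      \
                    cudaGetErrorString(err_));                      \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Same kernel, plus a bounds check (n is an added parameter).
__global__ void vecAdd(const float *A, const float *B, float *C, int n) {
    int i = threadIdx.x;
    if (i < n) {
        C[i] = A[i] + B[i];
    }
}

int main() {
    const int N = 5;
    const size_t bytes = N * sizeof(float);
    float A[N] = {2.0f, 3.0f, 4.0f, 5.0f, 6.0f};
    float B[N] = {2.0f, 4.0f, 6.0f, 8.0f, 10.0f};
    float C[N];

    float *Ad = NULL, *Bd = NULL, *Cd = NULL;
    CHECK(cudaMalloc((void **)&Ad, bytes));
    CHECK(cudaMalloc((void **)&Bd, bytes));
    CHECK(cudaMalloc((void **)&Cd, bytes));
    CHECK(cudaMemcpy(Ad, A, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(Bd, B, bytes, cudaMemcpyHostToDevice));

    // Zero the output buffer so leftovers from an earlier run cannot
    // be mistaken for fresh results.
    CHECK(cudaMemset(Cd, 0, bytes));

    // Deliberately launch fewer threads than elements.
    vecAdd<<<1, 4>>>(Ad, Bd, Cd, N);
    CHECK(cudaGetLastError());  // catches launch-time errors

    // Blocking copy; it can also surface errors from the kernel itself.
    CHECK(cudaMemcpy(C, Cd, bytes, cudaMemcpyDeviceToHost));
    printf("C: {%.2f, %.2f, %.2f, %.2f, %.2f}\n", C[0], C[1], C[2], C[3], C[4]);

    CHECK(cudaFree(Ad));
    CHECK(cudaFree(Bd));
    CHECK(cudaFree(Cd));
    return 0;
}
```

If the kernel really runs with 4 threads, the last element should print as 0.00 rather than a leftover value. If every element comes back as 0.00, or `CHECK` aborts, then the launch itself is probably failing and the earlier output was just old device memory.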
Additional system info which might be very helpful:
- Nvidia Driver: 177.67 x86_64
- CUDA Toolkit: 2.0 for Ubuntu 7.10 x86_64
- System: Frugalware Linux x86_64
- Device: GeForce 8800GT 256MB
Thanks for any help you might give!
David E. Narváez