Beginner's question

I am a beginner with CUDA and just want to follow some examples from the NVIDIA documentation. So I tried this sample program:

#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>

__global__ void vecAdd(float* A, float* B, float* C)
{
int i = threadIdx.x;
C[i]=A[i]+B[i];
}

int main()
{
float A[2];
float B[2];
float C[2];
int N=2;

A[0]=5.0f;
A[1]=1.0;
B[0]=10.0f;
B[1]=3.0;

vecAdd<<<1, N>>>(A, B, C);
printf("A=%f, B=%f, C=%f\n", A[0], B[0], C[0]);
printf("A2=%f, B2=%f, C2=%f\n", A[1], B[1], C[1]);
}

But it doesn't work. Both C[0] and C[1] print as 0. I think this is some basic problem I need to figure out. Thanks to everybody who can help.

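The kernel assumes that A, B and C point to device memory and therefore interprets these addresses as device memory addresses. But A, B and C are actually pointers to arrays in host memory. You need to allocate device buffers with cudaMalloc, copy the inputs to them with cudaMemcpy, launch the kernel on the device pointers, and copy the result back to the host.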
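
A minimal sketch of how that could look, assuming a single block is enough (so n must not exceed the per-block thread limit). The helper addOnDevice, the extra n parameter on the kernel, and the dA/dB/dC names are my own additions for illustration, not part of the original sample:

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Same kernel as in the question, plus a bounds check (n is an added parameter).
__global__ void vecAdd(const float* A, const float* B, float* C, int n)
{
    int i = threadIdx.x;
    if (i < n)
        C[i] = A[i] + B[i];
}

// Hypothetical helper: adds two host arrays on the GPU and writes the result to hC.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t addOnDevice(const float* hA, const float* hB, float* hC, int n)
{
    size_t bytes = n * sizeof(float);
    float* dA = NULL;
    float* dB = NULL;
    float* dC = NULL;

    // Allocate device memory for the inputs and the output.
    cudaError_t err = cudaMalloc((void**)&dA, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dB, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dC, bytes);

    // Copy the inputs from host memory to device memory.
    if (err == cudaSuccess) err = cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        // Launch with device pointers: one block of n threads.
        vecAdd<<<1, n>>>(dA, dB, dC, n);
        err = cudaGetLastError();  // reports launch failures
    }

    // Copy the result back; this call also waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    // cudaFree accepts a null pointer, so this is safe after a partial failure.
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return err;
}

int main()
{
    float A[2] = {5.0f, 1.0f};
    float B[2] = {10.0f, 3.0f};
    float C[2] = {0.0f, 0.0f};

    cudaError_t err = addOnDevice(A, B, C, 2);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("A=%f, B=%f, C=%f\n", A[0], B[0], C[0]);
    printf("A2=%f, B2=%f, C2=%f\n", A[1], B[1], C[1]);
    return 0;
}
```

Checking the return codes is worth keeping even in a toy program: in the original version the launch fails or the kernel reads invalid addresses, and without a check the only symptom is that C stays unchanged.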