Beginner's question

I am a beginner with CUDA and just want to follow some examples from the NVIDIA instructions. So I tried this sample program:

#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>

__global__ void vecAdd(float* A, float* B, float* C)
{
    // One thread per element
    int i = threadIdx.x;
    C[i] = A[i] + B[i];
}

int main()
{
    float A[2];
    float B[2];
    float C[2];
    int N = 2;

    vecAdd<<<1, N>>>(A, B, C);
    printf("A=%f, B=%f, C=%f\n", A[0], B[0], C[0]);
    printf("A2=%f, B2=%f, C2=%f\n", A[1], B[1], C[1]);
}

But it doesn’t work: both C[0] and C[1] print as 0. I think this is a basic problem I need to figure out. Thanks to everybody who can help.

The kernel “thinks” that A, B and C point to device memory and therefore interprets these addresses as device memory addresses. But A, B and C are actually pointers to arrays in host memory.
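
One way to fix this, as a rough sketch: allocate separate buffers on the device with cudaMalloc, copy A and B into them with cudaMemcpy, launch the kernel with the device pointers, and copy the result back into C. The helper name addOnDevice and the input values below are made up for illustration, and the sketch assumes the whole vector fits in a single block, as in the original launch.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Same kernel as in the question: one thread per element.
__global__ void vecAdd(const float* A, const float* B, float* C)
{
    int i = threadIdx.x;
    C[i] = A[i] + B[i];
}

// Hypothetical helper (not part of the CUDA API): adds two host arrays
// of length n on the GPU and writes the result to the host array hC.
// Assumes n is small enough to fit in one thread block.
cudaError_t addOnDevice(const float* hA, const float* hB, float* hC, int n)
{
    size_t bytes = n * sizeof(float);
    float* dA = NULL;
    float* dB = NULL;
    float* dC = NULL;

    // Allocate device memory; these pointers are only valid on the GPU.
    cudaMalloc((void**)&dA, bytes);
    cudaMalloc((void**)&dB, bytes);
    cudaMalloc((void**)&dC, bytes);

    // Copy the inputs from host memory to device memory.
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    // Launch with device pointers, not host pointers.
    vecAdd<<<1, n>>>(dA, dB, dC);

    // Pick up any error from the allocations, copies, or launch.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        // This copy waits for the kernel to finish before reading dC.
        err = cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return err;
}

int main()
{
    // Example inputs (assumed; the original post does not show any).
    float A[2] = {1.0f, 2.0f};
    float B[2] = {10.0f, 20.0f};
    float C[2] = {0.0f, 0.0f};
    int N = 2;

    cudaError_t err = addOnDevice(A, B, C, N);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("A=%f, B=%f, C=%f\n", A[0], B[0], C[0]);
    printf("A2=%f, B2=%f, C2=%f\n", A[1], B[1], C[1]);
    return 0;
}
```

Checking the returned cudaError_t is worth keeping even in a toy program: passing host pointers to a kernel does not stop compilation, so without an error check the only symptom is wrong output.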