Host function call cannot be configured

Hi,
I am just trying to get the vectorAddition sample to work, but I get the error
“Host function call cannot be configured”.

So my kernel invocation is:
//Kernel Invocation Code
addVector<<<1,1>>>(cx,cy,size);

What am I doing wrong here?

Also, I am not sure I understand the concept of thread IDs.
In vector addition, I do something like this:

int tx = threadIdx.x;
for (int k = 0; k < arrSize; k++) {
    float dx1 = dx[tx*arrSize+k];
    float dy1 = dy[tx*arrSize+k];
    dx1 = dx1 + dy1;
    // Write the result to the device memory - each thread writes
    // one element
    dx[tx*arrSize+k] = dx1;
}

Is this correct? Can someone please explain how thread IDs work?

Thanks,
Supraja J
(P.S. I have attached my code.)
MyFirstCUDAProgram.cu (3.84 KB)

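Place the kernel code (or at least its __global__ prototype) before the kernel invocation, not after. When the compiler reaches the <<<...>>> launch it must already have seen the function declared as __global__; otherwise it treats the name as an ordinary host function, and a host function cannot be given a launch configuration.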
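
For illustration, here is a minimal sketch of that ordering, which also touches on the thread ID question. The attached file is not visible here, so the kernel signature, the helper name runVectorAdd, and the block size of 256 are all assumptions; only the name addVector and the three-argument call come from the post. Each thread computes one global index from blockIdx.x, blockDim.x and threadIdx.x and handles exactly one element. Note that with <<<1,1>>> only a single thread runs (threadIdx.x is always 0), so the launch below uses enough blocks to cover the whole array.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// The kernel is defined (and marked __global__) BEFORE the code that launches it.
// Signature is an assumption; the original attachment is not available.
__global__ void addVector(float *dx, const float *dy, int n)
{
    // Global thread index: block offset plus the thread's position in its block.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {                 // guard: the last block may have spare threads
        dx[i] = dx[i] + dy[i];   // each thread writes exactly one element
    }
}

// Hypothetical host helper: adds two n-element vectors on the GPU and
// returns true if every result matches the CPU computation.
bool runVectorAdd(int n)
{
    std::vector<float> hx(n), hy(n), expected(n);
    for (int i = 0; i < n; ++i) {
        hx[i] = static_cast<float>(i);
        hy[i] = 2.0f * static_cast<float>(i);
        expected[i] = hx[i] + hy[i];
    }

    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *cx = nullptr, *cy = nullptr;
    if (cudaMalloc(&cx, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&cy, bytes) != cudaSuccess) { cudaFree(cx); return false; }

    cudaMemcpy(cx, hx.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(cy, hy.data(), bytes, cudaMemcpyHostToDevice);

    // Enough blocks of 256 threads to cover all n elements (256 is an arbitrary choice).
    int threadsPerBlock = 256;
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    addVector<<<blocks, threadsPerBlock>>>(cx, cy, n);

    bool ok = (cudaGetLastError() == cudaSuccess);
    // This copy runs after the kernel on the default stream, so it also waits for it.
    ok = ok && (cudaMemcpy(hx.data(), cx, bytes, cudaMemcpyDeviceToHost) == cudaSuccess);

    cudaFree(cx);
    cudaFree(cy);

    for (int i = 0; ok && i < n; ++i) {
        if (hx[i] != expected[i]) ok = false;
    }
    return ok;
}

int main()
{
    std::printf("vector add %s\n", runVectorAdd(1000) ? "ok" : "failed");
    return 0;
}
```

If you prefer to keep the kernel body at the bottom of the file, a forward declaration such as `__global__ void addVector(float *dx, const float *dy, int n);` above the launch should serve the same purpose.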