Hi,
I am just trying to get the vectorAddition sample to work, but I get the error
“Host function call cannot be configured”
So my kernel invocation is:
//Kernel Invocation Code
addVector<<<1,1>>>(cx,cy,size);
What am I doing wrong here?
Also, I am not sure I understand the concept of thread IDs.
In vector addition, I do something like this:
int tx = threadIdx.x;
for(int k = 0; k < arrSize; k++){
    float dx1 = dx[tx*arrSize+k];
    float dy1 = dy[tx*arrSize+k];
    dx1 = dx1+dy1;
    //Write the result to the device memory - each thread writes
    //one element
    dx[tx*arrSize+k] = dx1;
}
Is this correct? Can someone please explain how thread IDs work?
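For comparison, here is a minimal sketch of the one-element-per-thread pattern that the comment in the loop above refers to. The kernel name addVectorSketch, the array size, and the launch configuration are assumptions made up for illustration and are not taken from the attached file. It also assumes the kernel is declared with __global__; as far as I know, nvcc reports "a host function call cannot be configured" when the function launched with <<<...>>> lacks that qualifier.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not the one from the attachment).
// __global__ marks it as launchable from the host with <<<...>>>.
__global__ void addVectorSketch(float *x, const float *y, int n)
{
    // Global index: offset of this block plus the thread's position in it.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {                 // skip threads past the end of the array
        x[i] = x[i] + y[i];      // each thread handles exactly one element
    }
}

int main()
{
    const int n = 1000;                      // assumed array size
    const size_t bytes = n * sizeof(float);
    float hx[n], hy[n];
    for (int i = 0; i < n; ++i) {
        hx[i] = float(i);
        hy[i] = 2.0f * float(i);
    }

    float *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    // Enough blocks of 256 threads to cover all n elements.
    const int threadsPerBlock = 256;
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    addVectorSketch<<<blocks, threadsPerBlock>>>(dx, dy, n);

    cudaError_t err = cudaGetLastError();    // catches launch errors
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hx, dx, bytes, cudaMemcpyDeviceToHost);

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (hx[i] != 3.0f * float(i)) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(dx);
    cudaFree(dy);
    return mismatches == 0 ? 0 : 1;
}
```

With a <<<1,1>>> launch like mine there is only one block containing one thread, so threadIdx.x is always 0 and the loop over k ends up doing all the work in that single thread.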
Thanks,
Supraja J
(P.S. I have attached my code.)
MyFirstCUDAProgram.cu (3.84 KB)