multi-thread programming

I have a problem with the following code:

[codebox]float* d_A;  // device pointer, allocated in main()

static CUT_THREADPROC cputhread()
{
    printf("thread begin\n");

    // Host buffer that receives the two floats from the device
    float* h_A = (float*)malloc(sizeof(float) * 2);

    cutilSafeCall(cudaMemcpy(h_A, d_A, sizeof(float) * 2, cudaMemcpyDeviceToHost)); // error line

    printf("h_A[0]=%f,h_A[1]=%f\n", h_A[0], h_A[1]);

    free(h_A);
    CUT_THREADEND;
}

int main(int argc, char** argv)
{
    // Allocation happens in the main thread
    cutilSafeCall(cudaMalloc((void**)&d_A, sizeof(float) * 2));

    // The copy happens in a second host thread
    CUTThread cputhreadid;
    cputhreadid = cutStartThread((CUT_THREADROUTINE)cputhread, NULL);
    cutWaitForThreads(&cputhreadid, 1);

    cudaFree(d_A);
    return 0;
}
[/codebox]

Why is it wrong?

cudaSafeCall() Runtime API error in file <sor.cu>, line (error line): invalid device pointer.

Thanks.

The runtime is telling you that d_A isn’t a valid pointer. Why it isn’t a valid pointer is impossible to say based on that code.

I have no idea why, so I need help.

Probably a context problem. Contexts are tied to the thread in which they are established, and a memory allocation is only valid in the context in which it was created. If d_A is used in a different thread from the one that allocated it, it will be in a different context. The problem is that I have no idea what those cutThread calls do: they are not part of CUDA, they have no documentation, and they are host OS dependent, so, as I already said, it is impossible to say with any certainty what the problem is.
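If the context explanation is right, one way around it is to keep the allocation, the copies and the free all inside the same host thread. Here is a minimal sketch of that idea. It uses std::thread in place of the cutThread helpers, and the names check, worker and roundtrip_in_worker are made up for the example, so treat it as an illustration under those assumptions rather than a drop-in fix. Note also that the per-thread context behaviour applies to runtimes older than CUDA 4.0; from 4.0 onwards the runtime shares one context per device between the host threads of a process.

```cuda
#include <cstdio>
#include <thread>
#include <cuda_runtime.h>

// Hypothetical helper: report a failing runtime call and return false.
static bool check(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Worker thread: allocates, uses and frees the device buffer itself,
// so every runtime call here is issued from the same host thread.
static void worker(bool* ok)
{
    const float h_in[2]  = {1.0f, 2.0f};
    float       h_out[2] = {0.0f, 0.0f};
    float*      d_A      = nullptr;

    *ok = false;
    if (!check(cudaMalloc((void**)&d_A, sizeof(h_in)), "cudaMalloc"))
        return;

    // Round trip: host -> device -> host.
    bool copied =
        check(cudaMemcpy(d_A, h_in, sizeof(h_in), cudaMemcpyHostToDevice), "copy to device") &&
        check(cudaMemcpy(h_out, d_A, sizeof(h_out), cudaMemcpyDeviceToHost), "copy to host");

    cudaFree(d_A);

    if (copied) {
        std::printf("h_A[0]=%f,h_A[1]=%f\n", h_out[0], h_out[1]);
        *ok = (h_out[0] == h_in[0]) && (h_out[1] == h_in[1]);
    }
}

// Starts the worker, waits for it, and reports whether the round trip matched.
static bool roundtrip_in_worker()
{
    bool ok = false;
    std::thread t(worker, &ok);
    t.join();
    return ok;
}

int main()
{
    return roundtrip_in_worker() ? 0 : 1;
}
```

The main thread never touches d_A here. If the pointer really has to be produced in one thread and consumed in another on an old runtime, the driver API's context push/pop calls are the usual route, but that is a bigger change than this sketch covers.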

Perhaps d_A needs to be declared for device?

__device__ float* d_A;

Nope, avidday was already correct.