I want to read, from the host, the values of an array that a kernel function writes through a global pointer. So I first declared an array pointer in the kernel file:
__device__ unsigned int *result;
Then I initialize it in the main function:
CUDA_SAFE_CALL( cudaMalloc((void **)&result, NUM) );
I also defined a host array in the main function like this:
unsigned int *h_result;
h_result = (unsigned int *)malloc(NUM);
I operate on the device array in the kernel function, for example:
result[threadIdx.x] = threadIdx.x + 100; // just an illustrative operation
Then I copy the result array back to the host using cudaMemcpy:
CUDA_SAFE_CALL( cudaMemcpy(h_result, result, NUM, cudaMemcpyDeviceToHost) );
But the program always fails with an error.
Can anyone tell me where I went wrong?
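
For reference, here is a minimal self-contained sketch of the pattern described above. It is a guess at a working arrangement under stated assumptions, not a confirmed diagnosis. It assumes that NUM is an element count (so sizes are NUM * sizeof(unsigned int)), that the kernel is launched with one block of NUM threads, and that the device buffer is allocated through an ordinary host-side pointer whose value is then copied into the __device__ symbol with cudaMemcpyToSymbol. The names d_result, fill and CHECK are hypothetical; CHECK only stands in for CUDA_SAFE_CALL.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define NUM 64  // assumption: number of elements (the question does not give a value)

// Hypothetical stand-in for CUDA_SAFE_CALL: abort on any CUDA runtime error.
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Global device pointer, as in the question.
__device__ unsigned int *result;

// Hypothetical kernel: each thread writes one element through the global pointer.
__global__ void fill(void)
{
    result[threadIdx.x] = threadIdx.x + 100;
}

int main(void)
{
    const size_t bytes = NUM * sizeof(unsigned int);  // size in bytes, not elements

    unsigned int *h_result = (unsigned int *)malloc(bytes);
    if (h_result == NULL) return EXIT_FAILURE;

    // Allocate through a host-side pointer variable ...
    unsigned int *d_result = NULL;
    CHECK(cudaMalloc((void **)&d_result, bytes));

    // ... then store that address in the __device__ symbol so kernels can see it.
    CHECK(cudaMemcpyToSymbol(result, &d_result, sizeof(d_result)));

    fill<<<1, NUM>>>();
    CHECK(cudaGetLastError());  // reports launch errors

    // Copy back using the host-side copy of the device address.
    CHECK(cudaMemcpy(h_result, d_result, bytes, cudaMemcpyDeviceToHost));

    printf("h_result[0] = %u, h_result[%d] = %u\n",
           h_result[0], NUM - 1, h_result[NUM - 1]);

    CHECK(cudaFree(d_result));
    free(h_result);
    return 0;
}
```

The idea behind the sketch is that host code does not use a __device__ variable directly as a pointer argument to cudaMalloc or cudaMemcpy. Instead it keeps its own copy of the device address in d_result and uses the symbol only for access from inside the kernel.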