Device to Host Symbol Memcpy

I have a global array:

__device__ unsigned int d_array[10];

I want to copy the value from one single array element to host memory, in host code. Is this possible?

I can call cudaGetSymbolAddress() to get the address of the symbol d_array, but can I do any indexing or pointer arithmetic before calling cudaMemcpy(…, cudaMemcpyDeviceToHost) in order to copy a single array element?

I have had trouble doing this. Is the right approach to copy the entire array rather than trying to copy individual elements?

Ken Chaffin

You can do device pointer arithmetic on the host, you just can dereference device pointers. So something like

cudaMemcpy(hostpointer, devicepointer+index, sizeof(int), cudaMemcpyDeviceToHost)

should work, if devicepointer holds the address of the global memory symbol. Mind you, if your array is really only 10 integers, I doubt there will be any performance difference whether you copy one or ten.
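To make that concrete, here is a minimal sketch of the whole round trip, assuming the d_array declaration from the question. The kernel fill_array and the helper read_element are hypothetical names made up for this example, and the fill values are arbitrary.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ unsigned int d_array[10];

// Hypothetical kernel: puts known values in d_array so there is something to read.
__global__ void fill_array()
{
    unsigned int i = threadIdx.x;
    if (i < 10u) d_array[i] = 10u * i;
}

// Hypothetical helper: copies d_array[index] into *out.
// The caller is responsible for keeping index below 10.
cudaError_t read_element(size_t index, unsigned int *out)
{
    unsigned int *devicepointer = nullptr;
    cudaError_t err = cudaGetSymbolAddress((void **)&devicepointer, d_array);
    if (err != cudaSuccess) return err;

    // Pointer arithmetic only: the device address is computed on the host
    // but never dereferenced there.
    return cudaMemcpy(out, devicepointer + index, sizeof(unsigned int),
                      cudaMemcpyDeviceToHost);
}

int main()
{
    fill_array<<<1, 10>>>();
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    unsigned int value = 0;
    err = read_element(3, &value);
    if (err != cudaSuccess) {
        fprintf(stderr, "copy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("d_array[3] = %u\n", value);
    return 0;
}
```

Copying the whole array is the same call with devicepointer as the source and 10 * sizeof(unsigned int) as the size.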

Okay, that worked. What was causing the segmentation fault was passing devicepointer[index] rather than devicepointer+index as the argument. Now I have it working whether I copy the entire array or just one element. Yes, in my test case there isn’t much difference between copying the array and copying one element, but in the general case I can now do either.

I assume that you meant to type “you just can’t dereference” rather than “you just can dereference” below? And I suppose that devicepointer[index] is dereferencing the pointer whereas devicepointer+index is not.


Ken Chaffin

I did.

devicepointer is, unsurprisingly, a device pointer, so evaluating devicepointer[index] on the host dereferences it, that is, tries to read the contents of the device address it holds, which usually ends in a segfault, as you have seen. The address-of form &devicepointer[index] is a different matter: for a plain pointer it is equivalent to devicepointer+index and does not read memory, so it should also be accepted by cudaMemcpy, although devicepointer+index states the intent more plainly.
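For completeness, the CUDA runtime also provides cudaMemcpyFromSymbol, which takes a byte offset into the symbol, so the separate cudaGetSymbolAddress call can be skipped. This is not what was used above, just an alternative sketch, again assuming the d_array declaration from the question; fill_array and read_element_from_symbol are hypothetical names for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ unsigned int d_array[10];

// Hypothetical kernel: puts known values in d_array so there is something to read.
__global__ void fill_array()
{
    unsigned int i = threadIdx.x;
    if (i < 10u) d_array[i] = 10u * i;
}

// Hypothetical helper: copies d_array[index] into *out.
// The offset argument is in bytes, not elements.
// The caller is responsible for keeping index below 10.
cudaError_t read_element_from_symbol(size_t index, unsigned int *out)
{
    return cudaMemcpyFromSymbol(out, d_array, sizeof(unsigned int),
                                index * sizeof(unsigned int),
                                cudaMemcpyDeviceToHost);
}

int main()
{
    fill_array<<<1, 10>>>();
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    unsigned int value = 0;
    err = read_element_from_symbol(7, &value);
    if (err != cudaSuccess) {
        fprintf(stderr, "copy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("d_array[7] = %u\n", value);
    return 0;
}
```

Either route should give the same result; which one reads better is mostly a matter of whether you need the raw device pointer for anything else.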