How can the host access device global memory?

Hi everyone,

I have a question about how a host process can access device global memory.

In CUDA 4.0, pointers returned by cudaHostAlloc() can be used directly from within kernels running on UVA (Unified Virtual Addressing) enabled devices.

So a kernel can definitely read and write host memory.
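
For reference, this is roughly the direction that I understand already works. It is only a minimal sketch, assuming a device that supports mapped pinned memory; the kernel name fill and the sizes are placeholders of mine.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel: writes through a pointer that refers to pinned host memory.
__global__ void fill(float *p, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] = 2.0f * i;
}

int main()
{
    const int n = 100;
    float *h = NULL;

    // Request support for mapped host memory before the context is created.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    if (cudaHostAlloc((void**)&h, n * sizeof(float), cudaHostAllocMapped) != cudaSuccess) {
        fprintf(stderr, "cudaHostAlloc failed\n");
        return 1;
    }

    // Device-side alias of the host buffer (with UVA it should equal h).
    float *d = NULL;
    cudaHostGetDevicePointer((void**)&d, h, 0);

    fill<<<1, 128>>>(d, n);
    cudaDeviceSynchronize();   // wait, so the host reads finished results

    printf("h[3] = %f\n", h[3]);
    cudaFreeHost(h);
    return 0;
}
```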

With UVA support, can the host read and write device memory directly?

Or is there any other way to read and write global memory directly from the host?

Something similar to the following code. I know it's wrong; it's just an example.

float *d;

cutilSafeCall( cudaMalloc((void**)&d, sizeof(float)*100));

d[0] = 1; // host code writing through the device pointer (not valid, shown only to illustrate what I mean)

Or are cudaMemset and cudaMemcpy the only ways to write global memory? For example:

float *d;

cutilSafeCall( cudaMalloc((void**)&d, sizeof(float)*100));

// note: cudaMemset fills bytes, so this sets each byte of d[0] to 0x01, not the float value 1.0f
cutilSafeCall( cudaMemset(d, 1, sizeof(float)*1));

And is cudaMemcpy the only way to read global memory too? I'm confused.
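
To be concrete, this is the copy-based way I know of to write and then read back a single element. It is a minimal sketch that uses plain error checks in place of cutilSafeCall; the CHECK macro is my own helper, not part of CUDA.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e = (call);                                       \
        if (e != cudaSuccess) {                                       \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(e));                           \
            exit(1);                                                  \
        }                                                             \
    } while (0)

int main()
{
    float *d = NULL;
    CHECK(cudaMalloc((void**)&d, sizeof(float) * 100));

    // Write d[0] = 1.0f by copying one float from host to device.
    float one = 1.0f;
    CHECK(cudaMemcpy(d, &one, sizeof(float), cudaMemcpyHostToDevice));

    // Read d[0] back by copying one float from device to host.
    float back = 0.0f;
    CHECK(cudaMemcpy(&back, d, sizeof(float), cudaMemcpyDeviceToHost));

    printf("d[0] = %f\n", back);
    CHECK(cudaFree(d));
    return 0;
}
```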

I ask because my scenario is:

  1. The host process needs to read device memory while a kernel is executing (like the CPU polling device memory).

  2. When the host process finds that the device memory has been modified, it does something (like transferring the data back).

In other words, my question is: can a host process poll device memory while a kernel is executing?
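
To make the scenario concrete, here is a sketch of the closest thing I can write with what I know works. It is not what I really want: the flag lives in mapped pinned host memory (from cudaHostAlloc), not in memory from cudaMalloc, because I don't know whether the host can poll the latter. The kernel name worker, the single-block launch and the sizes are assumptions of mine.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: fills device memory, then raises a flag that lives in
// mapped pinned host memory so the CPU can notice it.
__global__ void worker(float *data, int n, volatile int *flag)
{
    int i = threadIdx.x;             // single-block launch assumed
    if (i < n) data[i] = 1.0f;
    __syncthreads();                 // the whole block has written its data
    if (threadIdx.x == 0) {
        __threadfence_system();      // order earlier writes before the flag write
        *flag = 1;
    }
}

int main()
{
    const int n = 100;
    cudaSetDeviceFlags(cudaDeviceMapHost);

    int *h_flag = NULL;
    float *d_data = NULL;
    if (cudaHostAlloc((void**)&h_flag, sizeof(int), cudaHostAllocMapped) != cudaSuccess ||
        cudaMalloc((void**)&d_data, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }
    *h_flag = 0;

    int *d_flag = NULL;
    cudaHostGetDevicePointer((void**)&d_flag, h_flag, 0);

    worker<<<1, 128>>>(d_data, n, d_flag);   // returns without waiting for the kernel
    if (cudaGetLastError() != cudaSuccess) {
        fprintf(stderr, "kernel launch failed\n");
        return 1;
    }

    // Step 1: the host polls the flag instead of calling a synchronize function.
    volatile int *poll = h_flag;
    while (*poll == 0) {
        // Stop spinning if the stream reports an error.
        cudaError_t st = cudaStreamQuery(0);
        if (st != cudaSuccess && st != cudaErrorNotReady) {
            fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(st));
            return 1;
        }
    }

    // Step 2: the flag changed, so transfer the data back.
    float h_data[n];
    cudaMemcpy(h_data, d_data, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("h_data[0] = %f\n", h_data[0]);

    cudaFreeHost(h_flag);
    cudaFree(d_data);
    return 0;
}
```

Here the cudaMemcpy still waits for the kernel to finish before copying, so only the flag is observed while the kernel is running. What I would like to know is whether the same kind of polling can be done on the cudaMalloc'ed memory itself.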

Thank you all.