I am using cudaHostAlloc to allocate mapped memory. I know that I need to call cudaThreadSynchronize() after launching a kernel so that the kernel's changes to the mapped memory are visible on the host.
My question is: do I also need to call cudaThreadSynchronize() after I change values in the mapped memory from the CPU, so that the update is visible on the GPU?
For example:
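int* mapped;         // host pointer to the mapped allocation
int* mapped_device;  // device pointer to the same memory
cudaHostAlloc((void**)&mapped, … );
cudaHostGetDevicePointer((void**)&mapped_device, mapped, … );
mapped[0] = 1;       // CPU write to the mapped memory
cudaThreadSynchronize(); // <--------- Do I need this?
kernel<<<… , … >>>(mapped_device);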
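
In case it helps, here is a minimal self-contained sketch of the scenario. The kernel body, the allocation size, the cudaHostAllocMapped flag, the <<<1, 1>>> launch configuration, and the cudaSetDeviceFlags call are placeholders I filled in for illustration, not my real code, and error checking is omitted:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1 to the first element of the mapped buffer.
__global__ void kernel(int* data)
{
    data[0] += 1;
}

int main()
{
    int* mapped = nullptr;         // host pointer to the pinned, mapped allocation
    int* mapped_device = nullptr;  // device-side pointer to the same memory

    // Assumption: request support for mapped pinned allocations up front.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Assumed size and flag: a single int, allocated as mapped memory.
    cudaHostAlloc((void**)&mapped, sizeof(int), cudaHostAllocMapped);
    cudaHostGetDevicePointer((void**)&mapped_device, mapped, 0);

    mapped[0] = 1;  // CPU write to the mapped memory

    // <--- the point in question: is a cudaThreadSynchronize() needed here?

    kernel<<<1, 1>>>(mapped_device);

    // Wait for the kernel so that its write is visible on the host.
    cudaThreadSynchronize();

    printf("mapped[0] = %d\n", mapped[0]);

    cudaFreeHost(mapped);
    return 0;
}
```

The only synchronization in this sketch is the one after the kernel launch; the commented line marks the spot I am unsure about.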
Thanks.