Is it fair to say that before unified (managed) memory came into play, UVA allowed the GPU to tap into the memory of the host, but not vice-versa?
As early as CUDA 2.2, there was support for zero-copy access from the device to host memory allocated via cudaHostAlloc (with the cudaHostAllocMapped flag). That became even simpler with CUDA 4.0 and UVA, when the pointer translation via cudaHostGetDevicePointer was no longer needed.
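To make the zero-copy case concrete, here is a minimal sketch of what I have in mind. The kernel name `increment`, the helper `run_zero_copy_demo`, and the array size are my own choices for illustration, and I am assuming a device that supports mapped pinned memory.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: increments each element in place.
// The pointer it receives refers to mapped (pinned) host memory.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Returns 0 if the GPU updated the host buffer as expected, 1 otherwise.
int run_zero_copy_demo() {
    const int n = 256;
    int *h_data = nullptr;

    // Required before context creation on pre-UVA setups; return value
    // deliberately ignored because newer runtimes enable mapping anyway.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Pinned host allocation that is also mapped into the device address space.
    if (cudaHostAlloc((void **)&h_data, n * sizeof(int),
                      cudaHostAllocMapped) != cudaSuccess) {
        return 1;
    }
    for (int i = 0; i < n; ++i) h_data[i] = i;

    // Pre-UVA style: ask the runtime for the device-side alias of h_data.
    // My understanding is that with UVA the two pointers are identical,
    // so h_data could be passed to the kernel directly.
    int *d_alias = nullptr;
    if (cudaHostGetDevicePointer((void **)&d_alias, h_data, 0) != cudaSuccess) {
        cudaFreeHost(h_data);
        return 1;
    }

    increment<<<(n + 127) / 128, 128>>>(d_alias, n);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFreeHost(h_data);
        return 1;
    }

    // No cudaMemcpy anywhere: the kernel wrote straight into host memory.
    int ok = 1;
    for (int i = 0; i < n; ++i) {
        if (h_data[i] != i + 1) ok = 0;
    }
    cudaFreeHost(h_data);
    return ok ? 0 : 1;
}

int main() {
    printf("zero-copy demo %s\n", run_zero_copy_demo() == 0 ? "passed" : "failed");
    return 0;
}
```

Here the device reads and writes host memory, which is the direction I believe has been possible all along.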
Yet accessing device memory directly on the host became available only in CUDA 6.0, with the introduction of Unified Memory.
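For the Unified Memory case, this is a minimal sketch of the pattern I mean, assuming CUDA 6.0 or later and a device that supports managed memory. The kernel name `scale` and the helper `run_managed_demo` are hypothetical. As far as I understand, the runtime migrates the data between host and device behind the scenes, so the host is not literally dereferencing device memory, but from the programmer's point of view a single pointer works on both sides.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies each element by a factor.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Returns 0 if host and device both saw the same data, 1 otherwise.
int run_managed_demo() {
    const int n = 256;
    float *data = nullptr;

    // One allocation, one pointer, usable from host and device code.
    if (cudaMallocManaged((void **)&data, n * sizeof(float)) != cudaSuccess) {
        return 1;
    }

    // Host writes through the managed pointer.
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    // Device updates the same allocation through the same pointer.
    scale<<<(n + 127) / 128, 128>>>(data, n, 2.0f);

    // Synchronize before the host touches the data again.
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(data);
        return 1;
    }

    // Host reads the results directly, without an explicit cudaMemcpy.
    int ok = 1;
    for (int i = 0; i < n; ++i) {
        if (data[i] != 2.0f) ok = 0;
    }
    cudaFree(data);
    return ok ? 0 : 1;
}

int main() {
    printf("managed memory demo %s\n", run_managed_demo() == 0 ? "passed" : "failed");
    return 0;
}
```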
Is my understanding correct?