I was wondering whether the drivers for modern GPUs still allocate an internal pinned buffer in host memory when transferring data from GPU memory to non-pinned (pageable) CPU memory.
This blog post from 2012 states that when you copy data from GPU memory to non-pinned CPU memory, a pinned buffer is implicitly allocated and used as an intermediate staging buffer for the transfer.
The reason given was that the GPU cannot directly access pageable memory because of possible page faults.
However, with UVM it seems that the GPU can now handle these page faults itself, so the intermediate buffer may no longer be necessary.
Do modern GPUs now directly transfer the data, or do they still allocate intermediate pinned buffers?
If they still allocate intermediate buffers, why are they needed?
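To make the question concrete, here is how I understand the staged path described in the blog post, written as a host-only C++ model. It is only a sketch of my mental model, not what any driver actually does: `staged_copy`, `device_src`, `staging`, and `pageable_dst` are names I made up, plain `memcpy` stands in for the DMA transfer, and an ordinary `std::vector` stands in for the pinned buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-only model of a staged device-to-host copy.
// device_src   : stands in for GPU memory (assumption, it is plain host memory here)
// staging      : stands in for the driver's internal pinned buffer
// pageable_dst : ordinary pageable host memory, the final destination
// Returns the number of chunks that passed through the staging buffer.
std::size_t staged_copy(const unsigned char* device_src,
                        unsigned char* pageable_dst,
                        std::size_t bytes,
                        std::vector<unsigned char>& staging) {
    assert(!staging.empty());  // a zero-sized staging buffer could never make progress
    std::size_t chunks = 0;
    for (std::size_t off = 0; off < bytes; off += staging.size()) {
        const std::size_t n = std::min(staging.size(), bytes - off);
        // Step 1: the "DMA" step, device memory into the pinned staging buffer.
        std::memcpy(staging.data(), device_src + off, n);
        // Step 2: a CPU copy from the staging buffer into pageable memory.
        std::memcpy(pageable_dst + off, staging.data(), n);
        ++chunks;
    }
    return chunks;
}
```

In this model every byte is copied twice, which is the overhead I am asking about: a direct transfer would drop step 2 and write into `pageable_dst` in a single pass.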