I really wonder what Unified Memory actually is. In hardware there is GPU memory (shared memory, L1, L2, and so on)
and CPU memory. There is no "Unified Memory" as a physical part. NVIDIA's documentation says that with a single pointer
we can access memory from either the CPU or the GPU.
At first I thought using Unified Memory meant that the GPU and CPU both use host memory through a single pointer, but I now think that is wrong.
So my question is this: when we use Unified Memory, which physical memory is actually being used at that moment?
This kind of feature is usually called "virtual memory". It means that both memories share a common address space, and the CPU/GPU hardware ensures that a memory page is copied to whichever device is trying to use it. So you don't need to worry: just allocate the memory and use it, and the hardware and driver will take care of the rest.
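To make the "single pointer" idea concrete, here is a minimal sketch, assuming a CUDA-capable GPU and a toolkit that supports `cudaMallocManaged`. The kernel `add_one` and the helper `run_managed_demo` are hypothetical names made up for this illustration. The same pointer is written by the CPU, updated by the GPU, and read back by the CPU with no explicit `cudaMemcpy`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread increments one element through the managed pointer.
__global__ void add_one(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical helper: returns the number of wrong elements, or -1 on a CUDA error.
int run_managed_demo(int n) {
    int *data = nullptr;
    // One allocation, one pointer, valid on both the CPU and the GPU.
    if (cudaMallocManaged(&data, n * sizeof(int)) != cudaSuccess) return -1;

    for (int i = 0; i < n; ++i) data[i] = i;      // written by the CPU

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    add_one<<<blocks, threads>>>(data, n);        // updated by the GPU

    // Wait for the kernel before the CPU touches the data again.
    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(data);
        return -1;
    }

    int mismatches = 0;
    for (int i = 0; i < n; ++i)                   // read back by the CPU
        if (data[i] != i + 1) ++mismatches;

    cudaFree(data);
    return mismatches;
}

int main() {
    int bad = run_managed_demo(1 << 20);
    printf("mismatches: %d\n", bad);
    return bad == 0 ? 0 : 1;
}
```

Where the pages physically live at any given moment (host DRAM or GPU memory) is decided by the driver and hardware, which is why the code never names a location.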
Are there any benchmarks, or comparative tests, that indicate the latency incurred by using main memory?
Of course, I assume that Unified Memory is optimised to:
give a higher priority to GPU memory in the combined memory space; and
handle the job of marshalling data between memory spaces.
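I don't have benchmark numbers to quote, but a rough way to see the cost yourself is to time the same kernel twice on a managed buffer that was first touched by the CPU. This is only a sketch under assumptions: `touch` and `time_launch` are hypothetical names, and it assumes a GPU and OS where pages migrate on demand (on other setups the driver may move the data before the launch, so the two timings can look similar). The first launch should include whatever migration happens; the second runs with the data already resident on the GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Touch every element so each page of the buffer is accessed by the GPU.
__global__ void touch(float *data, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Hypothetical helper: time one launch in milliseconds, or return -1 on error.
float time_launch(float *data, size_t n) {
    cudaEvent_t start, stop;
    if (cudaEventCreate(&start) != cudaSuccess) return -1.0f;
    if (cudaEventCreate(&stop) != cudaSuccess) {
        cudaEventDestroy(start);
        return -1.0f;
    }

    unsigned threads = 256;
    unsigned blocks = (unsigned)((n + threads - 1) / threads);

    cudaEventRecord(start);
    touch<<<blocks, threads>>>(data, n);
    cudaEventRecord(stop);

    float ms = -1.0f;
    if (cudaEventSynchronize(stop) == cudaSuccess)
        cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const size_t n = (size_t)1 << 24;             // 64 MiB of floats
    float *data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) return 1;

    for (size_t i = 0; i < n; ++i) data[i] = 0.0f; // first touch on the CPU

    float cold = time_launch(data, n);            // may include page migration
    float warm = time_launch(data, n);            // data already on the GPU

    printf("first launch:  %.3f ms\n", cold);
    printf("second launch: %.3f ms\n", warm);

    cudaDeviceSynchronize();
    cudaFree(data);
    return (cold < 0.0f || warm < 0.0f) ? 1 : 0;
}
```

The gap between the two timings is only a crude indicator, since it depends on the GPU generation, the driver, the interconnect, and the buffer size.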