My understanding of CPU-to-GPU transfer is as follows: if the data is in pageable memory (and possibly not even resident in RAM), the driver creates a copy of the data in a pinned staging region, which is then transferred to device memory. I have three questions:
- If the pageable memory buffer is already resident in RAM, does the OS simply lock the pages in place instead of copying?
- If the pageable memory buffer has been paged out to secondary storage, why can't the transfer use GPUDirect and skip the staging copy?
- How does performance vary when allocating a pageable memory buffer versus a pinned memory buffer?
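
For context on the last question, here is a minimal sketch (assuming a CUDA-capable device and the runtime API) of how one might compare host-to-device bandwidth from a `malloc`'d pageable buffer versus a `cudaMallocHost` pinned buffer; the size and timing approach are just illustrative choices:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Time a single host-to-device copy of `bytes` from `src` using CUDA events.
static float copy_ms(void* dst, const void* src, size_t bytes) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    cudaMemcpy(dst, src, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main() {
    const size_t bytes = 256ull << 20;  // 256 MiB, an arbitrary test size
    void* dev = nullptr;
    cudaMalloc(&dev, bytes);

    // Pageable allocation: the driver stages it through an internal pinned buffer.
    void* pageable = malloc(bytes);

    // Pinned (page-locked) allocation: DMA can read it directly, no staging copy.
    void* pinned = nullptr;
    cudaMallocHost(&pinned, bytes);

    printf("pageable H2D: %.2f ms\n", copy_ms(dev, pageable, bytes));
    printf("pinned   H2D: %.2f ms\n", copy_ms(dev, pinned, bytes));

    free(pageable);
    cudaFreeHost(pinned);
    cudaFree(dev);
    return 0;
}
```

On typical PCIe systems the pinned copy tends to show higher effective bandwidth, since the extra staging copy is skipped; the exact gap depends on the hardware.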
Thanks!