What exactly is page-locked memory? I understand it gets higher bandwidth on the bus between the CPU and GPU, resides in the host’s DRAM, and can be portable across CUDA contexts. But why not just use GPU memory directly? I’d take GDDR3 any day, unless there are benefits to pinned memory on the host that I’m missing. So…
(1) What is it?
(2) Why use it?
(3) When should I NOT use it?
From what I gather, pinned memory is great if you are going to be copying data back and forth between the CPU and GPU quite often, but may not be that beneficial if you’re only doing a few transfers…
There is some confusion in this statement between page-locked memory and the zero-copy features of CUDA 2.2.
Page-locked memory by itself is only directly accessible to the host, but as the above post mentions, data can be copied between page-locked memory and the GPU without an intermediate staging copy by the driver. On most systems this roughly doubles the speed of GPU transfers to/from host memory compared with pageable memory; Core i7 is the exception. (Triple-channel Core i7 systems have so much host memory bandwidth that, amazingly, there is almost no speed difference between page-locked and pageable memory.)
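If you want to see the difference on your own machine, here is a minimal timing sketch (the 64 MB buffer size and the time_copy helper are just illustrative, not from the original posts). It copies the same amount of data to the GPU once from a pageable buffer and once from a page-locked one, timing each with CUDA events:

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/* Time a single host-to-device copy with CUDA events. */
static float time_copy(void *dst, const void *src, size_t bytes)
{
    cudaEvent_t start, stop;
    float ms = 0.0f;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start, 0);
    cudaMemcpy(dst, src, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main(void)
{
    const size_t bytes = 64 << 20;      /* illustrative 64 MB buffer */
    void *pageable = malloc(bytes);     /* ordinary pageable memory  */
    void *pinned   = NULL;
    void *dev      = NULL;

    cudaMallocHost(&pinned, bytes);     /* page-locked host memory   */
    cudaMalloc(&dev, bytes);

    printf("pageable: %.2f ms\n", time_copy(dev, pageable, bytes));
    printf("pinned:   %.2f ms\n", time_copy(dev, pinned, bytes));

    cudaFree(dev);
    cudaFreeHost(pinned);
    free(pageable);
    return 0;
}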
My general philosophy is to use page-locked host memory for anything I need to transfer to the GPU more than once. The only reasons not to use it are if you don’t have control over how the buffer is allocated (you have to use cudaMallocHost() to allocate page-locked memory) or if your host buffer needs to be a significant fraction of the total memory in your computer. Page-locked memory cannot be swapped to disk (hence the name), so using too much of it reduces the amount of memory the OS can devote to everything else.
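A corollary: since pinned allocations come straight out of physical memory, a large request can fail, so check the return code rather than assuming success. A quick sketch of a fallback to pageable memory (the 256 MB size is illustrative):

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

int main(void)
{
    const size_t bytes = 256 << 20;   /* illustrative 256 MB request */
    void *buf = NULL;

    /* A big pinned allocation can fail on a memory-constrained host. */
    cudaError_t err = cudaMallocHost(&buf, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocHost: %s\n", cudaGetErrorString(err));
        buf = malloc(bytes);          /* pageable fallback            */
        /* ... use buf; transfers take the slower staged path ...     */
        free(buf);
        return 0;
    }

    /* ... use buf; transfers take the fast DMA path ... */
    cudaFreeHost(buf);
    return 0;
}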
Zero-copy is an entirely new and different feature in CUDA 2.2. It lets you map a page-locked host buffer directly into the GPU’s address space, so a running kernel can read and write host memory over the bus without any explicit cudaMemcpy(). Whether zero-copy is a win depends much more on the specifics of your algorithm.
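Here is a minimal zero-copy sketch, assuming a CUDA 2.2+ runtime and a device that reports canMapHostMemory; the scale kernel and the buffer size are placeholders of mine, not anything from the original posts:

#include <stdio.h>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;   /* reads and writes host memory over the bus */
}

int main(void)
{
    const int n = 1 << 20;
    float *host = NULL, *dev = NULL;

    /* Must be set before the CUDA context is created. */
    cudaSetDeviceFlags(cudaDeviceMapHost);

    /* Page-locked AND mapped into the GPU's address space. */
    cudaHostAlloc((void **)&host, n * sizeof(float), cudaHostAllocMapped);
    for (int i = 0; i < n; ++i)
        host[i] = 1.0f;

    /* Device-side pointer that aliases the host buffer. */
    cudaHostGetDevicePointer((void **)&dev, host, 0);

    scale<<<(n + 255) / 256, 256>>>(dev, n);
    cudaThreadSynchronize();          /* CUDA 2.2-era sync call */

    printf("host[0] = %f\n", host[0]);   /* 2.0, no cudaMemcpy needed */
    cudaFreeHost(host);
    return 0;
}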
Also, page-locked memory is, by default, not portable between CUDA contexts. You can use it from multiple host threads (each with its own context), but it will only be treated as page-locked in the context that created it. In the other contexts it is treated as pageable memory, which triggers an unnecessary staging copy by the driver on GPU transfers. CUDA 2.2 fixes this too: an allocation flag lets you mark page-locked memory as shared between all OS threads.
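With the CUDA 2.2 API that means passing cudaHostAllocPortable to cudaHostAlloc(); a sketch (the size is illustrative):

#include <cuda_runtime.h>

int main(void)
{
    const size_t bytes = 16 << 20;    /* illustrative size */
    float *buf = NULL;

    /* cudaHostAllocPortable marks the buffer as page-locked for every
       CUDA context, so any host thread gets the fast transfer path
       instead of the driver's extra staging copy. */
    cudaHostAlloc((void **)&buf, bytes, cudaHostAllocPortable);

    /* ... share buf across host threads / contexts ... */

    cudaFreeHost(buf);
    return 0;
}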