Pinned memory

Hi,
A few questions related to pinned memory:

  • When allocating pinned memory (possibly 256 MB - 1 GB), it seems some memory is being allocated in the RAM of the currently selected GPU. Why exactly? Is there some logic to the size that gets allocated? I need to know because I’m pre-allocating most of the GPU RAM, and the pinned allocations sometimes fail because of (what looks like) insufficient space in GPU RAM.
  • When copying a non-pinned buffer to the GPU, does the driver allocate chunks of pinned memory and perform intermediate copies from my pageable memory through the pinned memory to the device?
    If so, what are the sizes of those intermediate buffers?
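One way to investigate the first question empirically is to sample `cudaMemGetInfo` immediately before and after the pinned allocation. This is only a measurement sketch, not documented driver behavior; the observed overhead (if any) will vary by driver version and platform:

```
#include <cstdio>
#include <cuda_runtime.h>

// Probe: how much device memory (if any) does a pinned host
// allocation consume? Sample free device memory before and after.
int main() {
    size_t freeBefore = 0, freeAfter = 0, total = 0;
    cudaFree(0);                         // force context creation first
    cudaMemGetInfo(&freeBefore, &total);

    void* h = nullptr;
    const size_t bytes = 256ull << 20;   // 256 MB pinned allocation
    if (cudaHostAlloc(&h, bytes, cudaHostAllocDefault) != cudaSuccess) {
        printf("cudaHostAlloc failed\n");
        return 1;
    }
    cudaMemGetInfo(&freeAfter, &total);

    printf("device memory consumed by pinned alloc: %zu bytes\n",
           freeBefore - freeAfter);
    cudaFreeHost(h);
    return 0;
}
```

Running this at a few allocation sizes would show whether the device-side cost scales with the pinned allocation or is a fixed per-allocation amount.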

thanks
Eyal

For host->device copies, if the amount of data is small, it will be sent as part of the command stream. Larger copies use the DMA mechanism, for which the driver allocates a pinned buffer through which the data is copied in chunks.

It used to be that “small” ~= up to tens of kilobytes, and pinned buffer size in driver ~= single-digit megabytes, large enough to achieve good throughput for the DMA transfers. The sizes are not documented since they are implementation artifacts that could change between driver versions. With a bit of clever benchmarking you could probably reverse engineer what the sizes are for any given driver version, but this is not really something CUDA programmers should worry about (at least I have never had a need to do so, and using that kind of information tends to make one’s code brittle).
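The “clever benchmarking” mentioned above could look something like the sketch below: time pageable host-to-device copies at increasing sizes and watch for discontinuities in effective bandwidth, which hint at where the driver switches from embedding data in the command stream to staged DMA. The thresholds found this way are driver-specific implementation details, not anything to code against:

```
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Time a pageable host->device copy at each power-of-two size and
// report effective bandwidth. Jumps in the curve suggest internal
// driver thresholds (command-stream vs. staged DMA).
int main() {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    for (size_t bytes = 1 << 10; bytes <= (64u << 20); bytes <<= 1) {
        char* hostBuf = (char*)malloc(bytes);   // pageable memory
        char* devBuf  = nullptr;
        cudaMalloc(&devBuf, bytes);

        cudaMemcpy(devBuf, hostBuf, bytes, cudaMemcpyHostToDevice); // warm up
        cudaEventRecord(start);
        cudaMemcpy(devBuf, hostBuf, bytes, cudaMemcpyHostToDevice);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("%10zu bytes: %8.2f MB/s\n", bytes,
               (bytes / (1024.0 * 1024.0)) / (ms / 1000.0));

        cudaFree(devBuf);
        free(hostBuf);
    }
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}
```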

Thanks for the reply :)
Are the pinned buffers allocated by the driver cached? i.e., across multiple copy calls, will the driver reuse the previously allocated pinned buffers rather than reallocating them over and over?

What would you suggest for big (1-2 GB) non-pinned buffers - let the driver handle those copies through the DMA mechanism you’ve described, or try some custom solution of my own?

Also, when allocating 100 MB - 1 GB pinned buffers, why is there overhead in device memory, and is there a formula to estimate how much device/GPU memory will be required?

thanks
Eyal

I can’t see inside the driver, but it is a reasonable assumption that the pinned “transfer buffer” created by the driver is allocated once, and reused as often as necessary until the driver is unloaded.
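If a custom solution for the large pageable buffers turns out to be worthwhile, one common pattern is to replicate the driver’s staging mechanism yourself: a single pinned buffer, allocated once and reused, split into two halves so the CPU `memcpy` into one half overlaps the DMA out of the other. This is only a sketch; the staging size (`STAGE_BYTES` here) is an assumption to tune, not a documented optimum, and error checking is omitted for brevity:

```
#include <cstring>
#include <cuda_runtime.h>

// Do-it-yourself staged host->device copy for large pageable buffers:
// one reusable pinned staging buffer, split into two halves, with two
// streams so the memcpy into one half overlaps the DMA of the other.
const size_t STAGE_BYTES = 8u << 20;   // 2 x 4 MB halves (tunable)

void stagedCopyToDevice(void* dst, const void* src, size_t bytes) {
    static char* stage = nullptr;
    static cudaStream_t streams[2];
    if (!stage) {                      // allocate once, reuse forever
        cudaHostAlloc(&stage, STAGE_BYTES, cudaHostAllocDefault);
        cudaStreamCreate(&streams[0]);
        cudaStreamCreate(&streams[1]);
    }
    const size_t half = STAGE_BYTES / 2;
    size_t offset = 0;
    int slot = 0;
    while (offset < bytes) {
        size_t chunk = (bytes - offset < half) ? bytes - offset : half;
        // Wait until the previous DMA using this half has finished.
        cudaStreamSynchronize(streams[slot]);
        memcpy(stage + slot * half, (const char*)src + offset, chunk);
        cudaMemcpyAsync((char*)dst + offset, stage + slot * half,
                        chunk, cudaMemcpyHostToDevice, streams[slot]);
        offset += chunk;
        slot ^= 1;                     // alternate halves
    }
    cudaStreamSynchronize(streams[0]);
    cudaStreamSynchronize(streams[1]);
}
```

Whether this beats letting the driver do the staging is something to measure on the actual system; the main advantage is control over when and where the pinned memory is allocated.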

Hi,
Any insights as to why there is this “overhead” in device memory for my own pinned memory allocations, and how to estimate it?

thanks
Eyal