Estimate an upper limit for pinned memory (Windows, Linux): how?

HannesF99 · September 4, 2017, 7:30am

Is it possible to estimate how much CPU memory, in total, I can safely allocate as ‘pinned memory’ (page-locked memory) with the ‘cudaHostAlloc’ function?

The systems for which I need such a ‘safe guess’ are Windows (Windows 7 and 10 with the WDDM or TCC driver) and Linux (Ubuntu). All systems are 64-bit (x64).

I found a Stack Overflow thread (“About pinned memory in CUDA, is there an upper limit on it?”), but it does not give much information.

HannesF99 · September 5, 2017, 8:19am

Found a related thread: https://devtalk.nvidia.com/default/topic/977088/cudamallochost-crash-since-update-from-cuda-7-0-28-to-8-0-44/?offset=6

So it also seems to depend on the driver version; up to 5-10 GB seems to be OK.

njuffa · September 5, 2017, 8:38am

Since cudaHostAlloc() is a thin wrapper around appropriate OS API calls, you are at the mercy of whatever limits the OS imposes. 5-10 GB may be allowed on a system with tons of system memory, but I doubt it will work if the system has only 8 GB.

I am not aware of any OS calls that would tell you upfront how much memory is pinnable. It may well depend on the allocation sizes, due to fragmentation of the memory space.

I would be somewhat surprised if there is a real dependency on driver version, unless the driver changes the sequence of OS calls it performs between versions (this could change between major versions, but seems unlikely otherwise). I would expect this to depend on the state of the OS, so the amount of pinnable memory may differ between an uptime of 5 minutes and an uptime of 400 days on the same system.

On Linux there are system debug utilities (strace? dtrace?) that should give you visibility into the calls the driver makes to pin the memory (I think mmap() on Linux, but I could be wrong; it’s been many years since I looked at it). With that knowledge you might then ask in an OS-specific support forum whether there is any way of knowing a priori whether the call will succeed.

HannesF99 · September 5, 2017, 8:49am

I forgot to add that the systems our software runs on (media / video processing) usually have lots of RAM.

For Linux, I just saw that there is the ‘getrlimit’ function, used with the ‘RLIMIT_MEMLOCK’ resource.
See the getrlimit(2) Linux man page (“get/set resource limits”) and the Stack Overflow question “How much memory locked in a process”.
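
For reference, a minimal sketch of such a query on Linux. The helper names query_memlock_limit and print_memlock_limit are made up for illustration. The value reported is the per-process limit that applies to mlock()-style locking; whether it constrains cudaHostAlloc at all is not established in this thread, so treat it as one data point rather than a safe upper bound.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: reads the RLIMIT_MEMLOCK soft and hard limits (bytes).
   Returns 0 on success, -1 if getrlimit() fails.
   A value of RLIM_INFINITY means "no limit configured". */
int query_memlock_limit(rlim_t *soft, rlim_t *hard)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0)
        return -1;
    *soft = lim.rlim_cur;
    *hard = lim.rlim_max;
    return 0;
}

/* Hypothetical helper: prints both limits in a readable form. */
void print_memlock_limit(void)
{
    rlim_t soft, hard;
    if (query_memlock_limit(&soft, &hard) != 0) {
        perror("getrlimit");
        return;
    }
    if (soft == RLIM_INFINITY)
        printf("soft limit: unlimited\n");
    else
        printf("soft limit: %llu bytes\n", (unsigned long long)soft);
    if (hard == RLIM_INFINITY)
        printf("hard limit: unlimited\n");
    else
        printf("hard limit: %llu bytes\n", (unsigned long long)hard);
}
```

Calling print_memlock_limit() from main() shows the same information as `ulimit -l` in a shell, except in bytes rather than kilobytes.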

njuffa · September 5, 2017, 8:59am

I am not convinced those queries lead to a robust solution. As I understand it, all RLIMIT values are hard “not to be exceeded” limits. There may be myriad other reasons why such a limit cannot be reached at any given time. In general, software should expect that any dynamic memory allocation request (pinned or not) can fail and deal with the situation appropriately.
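
One way to “deal with the situation appropriately” is to treat page-locking as optional and fall back to pageable memory when it fails. The sketch below shows the pattern with plain Linux calls (mmap and mlock) rather than the CUDA API; the host_buffer type and its functions are hypothetical names. In a CUDA program the same structure would apply, with the return code of cudaHostAlloc checked instead.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical buffer descriptor for this sketch. */
typedef struct {
    void  *ptr;     /* NULL if the allocation failed entirely */
    size_t size;    /* requested size in bytes */
    int    locked;  /* 1 if the pages are locked in RAM, 0 if pageable */
} host_buffer;

/* Tries to obtain a page-locked buffer; if locking is refused
   (for example because of RLIMIT_MEMLOCK), keeps the buffer as
   ordinary pageable memory so the caller can still proceed. */
host_buffer host_buffer_alloc(size_t size)
{
    host_buffer b = { NULL, size, 0 };
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return b;               /* no memory at all: ptr stays NULL */
    b.ptr = p;
    if (mlock(p, size) == 0)
        b.locked = 1;           /* pinned */
    return b;                   /* otherwise: pageable fallback */
}

/* Releases the buffer; munmap() also removes any lock on the range. */
void host_buffer_free(host_buffer *b)
{
    if (b->ptr != NULL)
        munmap(b->ptr, b->size);
    b->ptr = NULL;
    b->locked = 0;
}
```

The caller inspects `locked` to decide whether a fast path that requires pinned memory is available, and `ptr` to detect outright failure.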

If failure during runtime is not an option (e.g. on most embedded systems), the usual strategy is to grab all the memory the program will ever require at program startup, putting it into a memory pool for example.
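
A minimal sketch of that startup-pool strategy, assuming a simple bump allocator is enough (no per-block free). The mem_pool type and pool_* functions are hypothetical names. malloc stands in for the one-time backing allocation here; in a CUDA program that single call would presumably be cudaHostAlloc, so the only pinned allocation that can fail is the one made at startup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pool: one backing block carved up front-to-back. */
typedef struct {
    unsigned char *base;      /* backing block, obtained once at startup */
    size_t         capacity;  /* size of the backing block in bytes */
    size_t         used;      /* bytes handed out so far, incl. padding */
} mem_pool;

/* Grabs the whole block once. Returns 0 on success, -1 on failure,
   so the program can refuse to start instead of failing later. */
int pool_init(mem_pool *p, size_t capacity)
{
    p->base = malloc(capacity);
    p->capacity = p->base ? capacity : 0;
    p->used = 0;
    return p->base ? 0 : -1;
}

/* Hands out `size` bytes aligned to `align` (a power of two),
   or NULL once the pool is exhausted. Never calls the OS. */
void *pool_alloc(mem_pool *p, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    if (p->base == NULL)
        return NULL;
    uintptr_t addr    = (uintptr_t)p->base + p->used;
    uintptr_t aligned = (addr + (align - 1)) & ~((uintptr_t)align - 1);
    size_t    start   = p->used + (size_t)(aligned - addr);
    if (start > p->capacity || size > p->capacity - start)
        return NULL;
    p->used = start + size;
    return p->base + start;
}

/* Returns the backing block to the system at shutdown. */
void pool_destroy(mem_pool *p)
{
    free(p->base);
    p->base = NULL;
    p->capacity = 0;
    p->used = 0;
}
```

The trade-off is that the pool must be sized for the worst case up front, which is why this approach fits systems with a known, bounded workload.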
