Estimate an upper limit for pinned memory (Windows, Linux) - how?

Is it possible to estimate how much CPU memory I can safely allocate (in total) as ‘pinned memory’ (page-locked memory) with the ‘cudaHostAlloc’ function?

The systems for which I would need such a ‘safe guess’ are Windows (Windows 7 & 10 with the WDDM / TCC driver) and Linux (Ubuntu). All systems are 64-bit (x64).

I found a Stack Overflow thread ( https://stackoverflow.com/questions/22300100/about-pinned-memory-in-cuda-is-there-an-upper-limit-on-it ), but it does not give much information.

Found a related thread - https://devtalk.nvidia.com/default/topic/977088/cudamallochost-crash-since-update-from-cuda-7-0-28-to-8-0-44/?offset=6

So it seems to also be driver (version) dependent; up to 5-10 GB seems to be OK.

Since cudaHostAlloc() is a thin wrapper around the appropriate OS API calls, you are at the mercy of whatever limits the OS imposes. 5-10 GB may be allowed on a system with plenty of system memory; I doubt it will work if the system has only 8 GB.

I am not aware of any OS calls that would tell you upfront how much memory is pinnable. It may well depend on the allocation sizes, due to memory space fragmentation.

I would be somewhat surprised if there were a real dependency on the driver version, unless the driver changes the sequence of OS calls it performs between versions (this could change between major versions, but seems unlikely otherwise). I would expect this to depend on the state of the OS, so the amount of pinnable memory may differ between an uptime of 5 minutes and an uptime of 400 days on the same system.

On Linux there are system debug utilities (strace? dtrace?) that should give you visibility into the calls made by the driver to pin the memory (I think mmap() on Linux, but I could be wrong; it’s been many years since I looked at it). With that knowledge you might then inquire in an OS-specific support forum whether there is any way of knowing a priori whether the call will succeed.

I forgot to add that the systems our software runs on (media / video processing) usually have lots of RAM.

For Linux, I just saw there is the ‘getrlimit’ function in combination with the ‘RLIMIT_MEMLOCK’ resource.
See https://linux.die.net/man/2/getrlimit and https://stackoverflow.com/questions/5762386/how-much-memory-locked-in-a-process

I am not convinced those queries lead to a robust solution. As I understand it, all RLIMIT values are hard “not to be exceeded” limits. There may be myriad other reasons such a limit cannot be reached at any given time. In general, software should expect that any dynamic memory allocation requests (pinned or not) can fail and deal with the situation appropriately.

If failure during runtime is not an option (e.g. most embedded systems), the usual strategy is to grab all the memory the program will ever require at program startup, putting it into a memory pool, for example.