Maximum size of page locked memory

worc1154 · February 2, 2009, 2:57pm

Hi,
Just wondering if there is a maximum size for page-locked memory under Linux? The machine I had been allocating it on had only 1GB of memory, so I assumed it was running out of available space when the code bailed. I have since added another 2GB, but with no apparent effect. Is there some other software limit?

Daniel

mfatica · February 2, 2009, 4:33pm

There is a limitation of 4GB for page locked memory.
With CUDA 2.2, this limitation will go away.

worc1154 · February 2, 2009, 4:35pm

Thanks for the confirmation, though clearly this is not what I am hitting so I’ll investigate further.

Daniel

tmurray · February 2, 2009, 5:59pm

I’ve never seen a machine that couldn’t allocate 80-90% of its physical memory as page-locked memory for CUDA (up to 4GB in pre-2.2), so I really don’t know what you’re hitting. However, I have never tried this on 32-bit Linux, so that might be something to consider.

E.D_Riedijk · February 2, 2009, 6:58pm

So, when are >4 GB cards coming? ;)

mfatica · February 2, 2009, 7:42pm

You will be able to copy a sub-matrix to/from GPU memory from/to a huge matrix in CPU memory.
The sub-matrix is going to be limited by the memory on the card.
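
A rough sketch of that block-wise pattern, for anyone following along. This is plain host-side Python with no CUDA calls; `tile_ranges` and `max_square_tile` are hypothetical helper names, and the 4-byte element size is an assumption (single-precision floats). In a real CUDA program each yielded range would correspond to one host-to-device copy of a sub-matrix, for example with `cudaMemcpy2D`.

```python
import math


def max_square_tile(device_bytes, elem_bytes=4):
    """Edge length of the largest square tile that fits in device_bytes.

    elem_bytes=4 assumes single-precision floats (an assumption, not
    something stated in the thread).
    """
    return math.isqrt(device_bytes // elem_bytes)


def tile_ranges(rows, cols, tile_rows, tile_cols):
    """Yield (row_start, row_stop, col_start, col_stop) for each sub-matrix.

    Stop indices are exclusive; tiles on the bottom and right edges are
    clipped to the matrix bounds.
    """
    for r in range(0, rows, tile_rows):
        for c in range(0, cols, tile_cols):
            yield r, min(r + tile_rows, rows), c, min(c + tile_cols, cols)


if __name__ == "__main__":
    # Walk a 5x4 host matrix in tiles of at most 2x3 elements.
    for tile in tile_ranges(5, 4, 2, 3):
        print(tile)
```

For a card with 4 GB of memory and float elements, `max_square_tile(4 * 2**30)` gives 32768, though in practice the tile would have to be smaller to leave room for other buffers on the device.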

E.D_Riedijk · February 2, 2009, 8:28pm

Ahh, of course. Should have thought of that, still in the process of understanding block-wise matrix algorithms…

Mr_Nuke · March 9, 2009, 7:43pm

Does that mean nVidia has >4G devices in the oven?

worc1154 · March 9, 2009, 10:01pm

The Tesla cards are already 4GB

Daniel

Mr_Nuke · March 9, 2009, 10:23pm

Yeah, I know that. They’re quite expensive. >4GB means greater than 4GB.

worc1154 · March 9, 2009, 10:29pm

Ok, I would be surprised to see bigger than 4GB, as I believe it is a 32-bit processor, though larger amounts of page-locked memory could be very useful on the host. As for larger commodity cards, I think that will depend on when games etc. have a use for the extra space :-)

Daniel

seibert · March 10, 2009, 12:41am

On 64-bit hosts, device pointers are already 64-bit. There’s no reason to suspect that CUDA devices can’t address more than 4 GB of memory. I imagine the limitation is more one of market/price and not technology. :)

tmurray · March 10, 2009, 12:52am

tsk, tsk… sizeof(CUdeviceptr) == 4.

seibert · March 10, 2009, 1:03am

Wait, what? Then why all the argument about 64 bit pointers slowing down CUDA kernels on 64 bit hosts? [Damn search engine won’t show me where that thread is.]

Mr_Nuke · March 10, 2009, 1:24am

Yes, it is a 32-bit processor. That’s why supporting anything over 4GB will require a major rework of the architecture. I’m actually impressed by how nVidia managed to put the full 4GB on a 32-bit device.

I agree that more than 4GB of page-locked memory could be helpful… but only on systems with 2 or more S1060s and powered by a Nehalem with triple-channel DDR3 or a dual+ processor system… Anything else simply wouldn’t have enough memory bandwidth to satiate two transfers.

Mr_Nuke · March 10, 2009, 1:32am

You’re probably referring to a thread I started a while back.

Pointers are treated by the 64-bit nvcc compiler as being 8 bytes wide to be consistent with the size of pointers on the host. In very specific cases, this increases the register usage of a kernel vs. its 32-bit counterpart to the point that it lowers its occupancy. Thus the kernel becomes slower. I have one such headache-producing kernel.

I’m not sure how CUdeviceptr works, but it’s most likely declared as an unsigned int, not a true pointer like void*. But if you have a float*, for example, that will be treated as 8 bytes wide even within the device on 64-bit compilation.
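
To see the difference in width on the host, here is a small sketch using Python’s `ctypes`. It is only an illustration: it measures host-side C types, on the assumption (guessed above, not confirmed) that a handle like CUdeviceptr is declared as an unsigned int. It does not query CUDA itself, and `type_widths` and `addressable_bytes` are hypothetical names.

```python
import ctypes


def type_widths():
    """Return the size in bytes of a few host C types."""
    return {
        "unsigned int": ctypes.sizeof(ctypes.c_uint),
        "void*": ctypes.sizeof(ctypes.c_void_p),
        "float*": ctypes.sizeof(ctypes.POINTER(ctypes.c_float)),
    }


def addressable_bytes(width_bytes):
    """Number of distinct byte addresses a value of this width can hold."""
    return 2 ** (8 * width_bytes)


if __name__ == "__main__":
    for name, size in type_widths().items():
        print(f"sizeof({name}) = {size}")
    # A 4-byte handle can distinguish 2**32 byte addresses, i.e. 4 GiB.
    print(addressable_bytes(4) // 2**30, "GiB")
```

On a typical 64-bit host the two pointer types come out 8 bytes wide while unsigned int stays at 4 bytes, which matches the mismatch described above.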

Here’s the thread: http://forums.nvidia.com/index.php?showtop…mp;#entry502288

MJH22 · March 18, 2009, 3:41pm

Check the ‘max locked memory’ limit with ‘ulimit -l’ (it is also listed in the output of ‘ulimit -a’). It can be overridden with ulimit or by changing ‘memlock’ in /etc/security/limits.conf.
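
The same limit can be read from inside a process. A minimal sketch, assuming Linux and Python’s standard `resource` module; `memlock_limits` and `describe` are hypothetical helper names. Whether this limit is what the allocation in the original post was hitting is not established in this thread.

```python
import resource


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK values in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def describe(limit):
    """Format a limit in 1024-byte units, the unit bash's `ulimit -l` uses."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} KiB"


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print("soft:", describe(soft))
    print("hard:", describe(hard))
```

The soft value is the one the shell reports by default; an unprivileged process can raise it only up to the hard value.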
