amount of pinned memory

dvorkinp · December 1, 2008, 2:20pm

Can anybody tell me the maximum amount of pinned memory (cudaMallocHost) I can allocate? Under XP-32 I can allocate the 150 MB I need, while under Vista-64 I can’t. Is that possible?

E.D_Riedijk · December 1, 2008, 3:08pm

This depends on your host machine, so I don’t think anybody can say anything useful about this.

alex_dubinsky · December 1, 2008, 8:07pm

See here: “Memory Limits for Windows and Windows Server Releases - Win32 apps | Microsoft Docs”

In Vista x64, this is apparently 40% of RAM by default. I don’t know why you can’t allocate even 150MB, but there are various registry and security settings that restrict this. (You can crash a system by allocating pinned memory!) Search around for the term Microsoft uses, “non-paged memory” or “non-paged pool.”

theMarix · December 1, 2008, 10:35pm

I think it depends heavily on your OS. On Linux I have successfully allocated more than 15 GByte of pinned memory, leaving only around 300 MByte of physical memory for the OS. Now, you should be careful not to trigger the out-of-memory killer if the system is out of your physical reach …

tmurray · December 2, 2008, 12:55am

I went to 7.7 of 8GB on my Linux64 box. Then it got upset and started killing applications. :(

alex_dubinsky · December 2, 2008, 1:25am

And you can do this as an unprivileged user?

dvorkinp · December 2, 2008, 7:25am

Does this program have to be started with “Run as Administrator”, or is a normal run enough?

theMarix · December 2, 2008, 8:17am

On Linux, Yes!

alex_dubinsky · December 2, 2008, 10:36pm

See… an unprivileged user can use cudaMallocHost() to crash programs on Linux.

Vista does what it does for a reason. Score Microsoft: 1

alex_dubinsky · December 2, 2008, 10:37pm

I don’t know. There are probably different security policies for administrators and limited users. Have you tried it as an administrator?

tmurray · December 3, 2008, 12:44am

As far as I could tell it will only kill programs the current user is running.

theMarix · December 3, 2008, 6:40am

I’d second that.

Reimar · December 3, 2008, 11:32am

Uh, this doesn’t have that much to do with Vista or Microsoft. On Linux, you can specify how much pinned memory to allow with ulimit -l.

But the NVIDIA graphics driver runs with the highest privileges and can of course ignore that and do things that make it unsuitable for a multi-user system. That applies to Vista just as well, except that Microsoft might not sign your driver (and thus not allow it to run at such a high privilege level) if they know about it.

Feel free to disable module loading support in Linux if you want the equivalent of only being able to run “approved” drivers :P .

And about the Linux OOM killer: it might kill anything, no matter which user owns it, but it tries to find the “best” process to kill (which usually means some large non-root process which has run only a short time).

theMarix · December 3, 2008, 12:04pm

In my experience, it is most often the sshd process of the user who started the triggering application. ;)

dvorkinp · December 3, 2008, 12:44pm

I made a simple test. I have 2 GB of RAM, of which 1.16 GB is currently free. Vista-64, Service Pack 1.

int
main(int argc, char** argv)
{
    unsigned char *h_idata = NULL;
    CUDA_SAFE_CALL( cudaMallocHost( (void**)&h_idata, 256 * (1<<20) ) );
}

The test fails. With 128 it works.

Can anybody explain what is wrong?

alex_dubinsky · December 3, 2008, 6:55pm

I’ve tried your code. Same behavior. However, there is apparently more to it. The first call to cudaMallocHost can only allocate 128MB. The second call can allocate up to 256MB, and you can keep allocating 256MB pieces until there’s no more free RAM. I didn’t try to crash any programs, but I’m sure a few would crash eventually. Fun fact: according to Task Manager, physical RAM doesn’t get allocated until you touch it.

[codebox]
#include "cutil.h"
#include <stdlib.h>
#include <stdio.h>
#include <cuda_runtime.h>

int
main(int argc, char** argv)
{
    unsigned *h_idata1, *h_idata2, *h_idata3;

    printf("Press return...");getchar();

    /* First call: 128 MB is the most that succeeds here. */
    CUDA_SAFE_CALL( cudaMallocHost( (void**)&h_idata1, 128 * (1<<20) ) );
    /* Touch every element so the pages are really backed by RAM, then verify. */
    for(unsigned i= 0; i< 128*(1<<20)/sizeof(unsigned); i++)
        h_idata1[i] = i;
    for(unsigned i= 0; i< 128*(1<<20)/sizeof(unsigned); i++)
        if(h_idata1[i] != i) printf("DATA MISMATCH\n");

    printf("Press return...");getchar();

    /* Second call: 256 MB now succeeds. */
    CUDA_SAFE_CALL( cudaMallocHost( (void**)&h_idata2, 256 * (1<<20) ) );
    for(unsigned i= 0; i< 256*(1<<20)/sizeof(unsigned); i++)
        h_idata2[i] = i;
    for(unsigned i= 0; i< 256*(1<<20)/sizeof(unsigned); i++)
        if(h_idata2[i] != i) printf("DATA MISMATCH\n");

    /* Keep allocating 256 MB blocks (never freed) until an allocation fails. */
    do{
        printf("Press return...");getchar();
        CUDA_SAFE_CALL( cudaMallocHost( (void**)&h_idata3, 256 * (1<<20) ) );
        for(unsigned i= 0; i< 256*(1<<20)/sizeof(unsigned); i++)
            h_idata3[i] = i;
        for(unsigned i= 0; i< 256*(1<<20)/sizeof(unsigned); i++)
            if(h_idata3[i] != i) printf("DATA MISMATCH\n");
    }while(1);

    /* Not reached: the loop above only ends when CUDA_SAFE_CALL aborts. */
    printf("Press return...");getchar();
}
[/codebox]

dvorkinp · December 4, 2008, 7:23am

Unfortunately your solution doesn’t help. Yes, I get no more errors from cudaMallocHost, but now cudaMemcpy fails.

[codebox]
#include "cutil.h"
#include <stdlib.h>
#include <stdio.h>
#include <cuda_runtime.h>

/* Parenthesized so the macros expand safely inside larger expressions. */
#define SIZE  (128*(1<<20))
#define SIZE2 (128*(1<<20))

int main(int argc, char** argv)
{
    unsigned *h_idata1, *h_idata2, *h_idata3;
    unsigned *d_idata1, *d_idata2, *d_idata3;

    printf("Press return...");getchar();

    /* First pair: pinned host buffer plus device buffer of the same size. */
    CUDA_SAFE_CALL( cudaMallocHost( (void**)&h_idata1, SIZE ) );
    CUDA_SAFE_CALL( cudaMalloc( (void**)&d_idata1, SIZE ) );
    for(unsigned i= 0; i< SIZE/sizeof(unsigned); i++)
        h_idata1[i] = i;
    CUDA_SAFE_CALL( cudaMemcpy(d_idata1, h_idata1, SIZE , cudaMemcpyHostToDevice) );
    for(unsigned i= 0; i< SIZE/sizeof(unsigned); i++)
        if(h_idata1[i] != i) printf("DATA MISMATCH\n");

    printf("Press return...");
    getchar();

    /* Second pair. */
    CUDA_SAFE_CALL( cudaMallocHost( (void**)&h_idata2, SIZE2 ) );
    CUDA_SAFE_CALL( cudaMalloc( (void**)&d_idata2, SIZE2 ) );
    for(unsigned i= 0; i< SIZE2/sizeof(unsigned); i++)
        h_idata2[i] = i;
    CUDA_SAFE_CALL( cudaMemcpy(d_idata2, h_idata2, SIZE2 , cudaMemcpyHostToDevice) );
    for(unsigned i= 0; i< SIZE2/sizeof(unsigned); i++)
        if(h_idata2[i] != i)
            printf("DATA MISMATCH\n");

    /* Keep allocating host/device pairs (never freed) until a call fails. */
    do
    {
        printf("Press return...");getchar();
        CUDA_SAFE_CALL( cudaMallocHost( (void**)&h_idata3, SIZE2 ) );
        CUDA_SAFE_CALL( cudaMalloc( (void**)&d_idata3, SIZE2 ) );
        for(unsigned i= 0; i< SIZE2/sizeof(unsigned); i++)
            h_idata3[i] = i;
        CUDA_SAFE_CALL( cudaMemcpy(d_idata3, h_idata3, SIZE2 , cudaMemcpyHostToDevice) );
        for(unsigned i= 0; i< SIZE2/sizeof(unsigned); i++)
            if(h_idata3[i] != i) printf("DATA MISMATCH\n");
    }
    while(1);

    /* Not reached: the loop above only ends when CUDA_SAFE_CALL aborts. */
    printf("Press return...");
    getchar();
}
[/codebox]

alex_dubinsky · December 4, 2008, 4:45pm

It’s true. But why would cudaMemcpy() return “out of memory”?

It’s about time NVIDIA took a look at it.
