Consumption of host memory increases abnormally

Hi,

I just found that the consumption of host memory seems to increase abnormally in the new CUDA release.
The following is a very simple demo program.

#include <cuda_runtime.h>
#include <cstdlib>

int main()
{
   // set GPU ID
   cudaSetDevice( 0 );

   // allocate host memory
   const long NGB = 9;
   const long N   = NGB*1024*1024*1024/sizeof(float);
   float *Array   = new float [N];

   // arbitrary work on Array
   // …

   delete [] Array;
   exit(0);
}

The “top” command shows that the demo program consumes about 18.3 GB, roughly twice as much as expected!
However, if I comment out the line “cudaSetDevice( 0 );”, the memory consumption becomes normal (about 9 GB).
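
For reference, the same two numbers can also be read from inside the program. The following is a minimal sketch (Linux only, reading /proc/self/status) of how one could print the virtual size and the resident size before and after cudaSetDevice( 0 ):

#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// print VmSize (virtual size) and VmRSS (resident size) of the current process
void PrintMemoryUsage( const char *label )
{
   FILE *fp = fopen( "/proc/self/status", "r" );
   if ( fp == NULL )  return;

   char line[256];
   while ( fgets( line, sizeof(line), fp ) != NULL )
      if ( strncmp( line, "VmSize:", 7 ) == 0  ||  strncmp( line, "VmRSS:", 6 ) == 0 )
         printf( "[%s] %s", label, line );

   fclose( fp );
}

int main()
{
   PrintMemoryUsage( "before cudaSetDevice" );
   cudaSetDevice( 0 );
   PrintMemoryUsage( "after cudaSetDevice" );
   return 0;
}

On the affected system, VmSize should jump by several GB after cudaSetDevice() while VmRSS barely changes.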
The information for the system exhibiting this issue is as follows.

CUDA version: 4.0 RC2
Driver version: 270.41.06
OS: Scientific Linux SL release 5.4 (Boron)

I suspect that this issue is caused by the new graphics driver and/or CUDA toolkit, since the problem appeared after a system upgrade.
I also tested the demo program on a different system with older versions of CUDA and the driver, and everything works fine there:
the memory consumption stays normal even when CUDA functions are called.
The information for this system is as follows:

CUDA version: 3.2
Driver version: 260.19.21
OS: CentOS release 5.5 (Final)

Has anyone had a similar problem? Any help is appreciated!

I’m pretty sure you’re looking at the total (virtual) memory column, and that number is irrelevant here.

The RSS column is what your application really uses, and I don’t think this one will change dramatically with just the addition of cudaSetDevice().

You can try “pmap -x <pid>” to see where your memory is.

It’s the same thing if you mmap a 200 GB file (which sits on disk): you would see 200 GB in top, but that does not mean you’re using 200 GB.
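
A minimal sketch of that effect (using a large anonymous mapping instead of a real file, just to keep it self-contained; MAP_ANONYMOUS and MAP_NORESERVE are Linux-specific):

#include <cstdio>
#include <sys/mman.h>
#include <unistd.h>

int main()
{
   // reserve 200 GB of address space without touching it;
   // MAP_NORESERVE asks the kernel not to account swap for the mapping
   const size_t size = 200UL*1024*1024*1024;
   void *p = mmap( NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0 );
   if ( p == MAP_FAILED )  {  perror( "mmap" );  return 1;  }

   // at this point "top" reports ~200 GB under VIRT, but RSS stays tiny
   printf( "mapped %zu bytes at %p (pid %d); inspect with top or pmap -x\n",
           size, p, (int)getpid() );
   pause();   // keep the process alive so it can be inspected

   munmap( p, size );
   return 0;
}

Only the pages that are actually written to would ever show up in RSS.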

-Guillaume

Thomasco, thanks a lot for your reply!
You are right. Here I refer to the “VIRT” memory shown by the top command.
The actual memory consumption (the RSS column shown by “pmap -x”) is normal.
However, the abnormal virtual memory consumption is still an unpleasant property.
In particular, on a system with a limited amount of virtual memory, the program
can crash because of this enormous virtual memory footprint.

I have verified that the abnormal virtual memory consumption only occurs in CUDA 4.0
(in both RC2 and the production release).
If I revert both the driver and the CUDA toolkit to version 3.2, this issue no longer appears.

Does anyone have any suggestions?

I also noticed this on my seven-GPU testbed with CUDA 4.0. Once I open a CUDA context on a device, the process grabs 24 GB of virtual memory. The actual memory usage is fine, so there is no reason to be worried, but it is a little weird.

Is this related to the Unified Virtual Addressing feature of CUDA 4?

This is related to UVA. We have to carve out a chunk of virtual memory equal to the total physical GPU memory, plus the total system memory, plus some small fudge factor for alignment purposes.

We actually throttle back on the UVA region if you run out of virtual memory. This will restrict the amount of memory you can allocate, though.
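
As a rough check of that explanation (not an official formula, and the fudge factor is ignored), one can sum the per-device memory sizes and the system RAM and compare the total with the extra VIRT reported by top:

#include <cstdio>
#include <unistd.h>
#include <cuda_runtime.h>

int main()
{
   int NDevice = 0;
   cudaGetDeviceCount( &NDevice );

   // total physical memory over all GPUs
   size_t GPUMem = 0;
   for (int d=0; d<NDevice; d++)
   {
      cudaDeviceProp prop;
      cudaGetDeviceProperties( &prop, d );
      GPUMem += prop.totalGlobalMem;
   }

   // total physical system memory (Linux)
   const size_t SysMem = (size_t)sysconf( _SC_PHYS_PAGES ) * (size_t)sysconf( _SC_PAGE_SIZE );

   const double GB = 1024.0*1024.0*1024.0;
   printf( "GPU memory (%d devices)  : %6.1f GB\n", NDevice, GPUMem/GB );
   printf( "system memory            : %6.1f GB\n", SysMem/GB );
   printf( "expected UVA reservation : %6.1f GB (plus a small fudge factor)\n",
           (GPUMem + SysMem)/GB );
   return 0;
}

On the seven-GPU system mentioned above, that total should be in the same ballpark as the 24 GB that top reports.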

Tmurray, thanks for the explanation!
So, do you have any suggestions for the case where a program is terminated because its
virtual memory consumption exceeds the upper limit reported by “ulimit”?
Is there any way to get around this issue?
Otherwise, I will have to try to convince the system administrator to raise the virtual
memory limit to “unlimited”.
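
For what it is worth, the limit in question can also be checked from inside the program; here is a minimal sketch (POSIX, using getrlimit with RLIMIT_AS, which is the limit behind “ulimit -v”):

#include <cstdio>
#include <sys/resource.h>

int main()
{
   // RLIMIT_AS limits the virtual address space size in bytes
   // ("ulimit -v" reports the same limit in KB)
   struct rlimit rl;
   if ( getrlimit( RLIMIT_AS, &rl ) != 0 )  {  perror( "getrlimit" );  return 1;  }

   if ( rl.rlim_cur == RLIM_INFINITY )
      printf( "virtual memory limit: unlimited\n" );
   else
      printf( "virtual memory limit: %.1f GB\n", rl.rlim_cur/(1024.0*1024.0*1024.0) );

   return 0;
}

If the reported limit is smaller than the GPU-plus-system total estimated above, raising it before launching the CUDA program is probably the only workaround.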