cudaMalloc fails on huge allocation

Joachimvdh · March 27, 2011, 2:16pm

Hello everyone.
I have run into allocation problems with my program. I have a GeForce 8800 GT. The NVIDIA X Server Settings window tells me I have 512 MB of onboard memory, which is correct. I installed the 260.19.26 development drivers and I use CUDA toolkit 3.2. My application requires me to perform least squares on huge matrices of floats. For that I use the CULA library.

The current example I’m working with results in a matrix with 44 columns. As soon as I go past roughly 1000000 rows, the program fails. I inserted asserts and they produce the following:
mem::Linear::Linear(const T*, size_t) [with T = float, size_t = long unsigned int]: Assertion `cudaMalloc(&_dev,_size*sizeof(T)) == cudaSuccess’ failed.

Linear is a template class I made as a wrapper for linear memory on the GPU. 44 x 1000000 x 4 = 176000000 bytes, which is only 176 MB (assuming a float is 4 bytes).
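The size estimate above can be checked with a few lines of C++. This is only a minimal sketch, not code from the original program: `matrix_bytes` and `to_mb` are hypothetical helper names, and the calculation assumes a dense matrix with no padding and a 4-byte `float`.

```cpp
#include <cstddef>

// Hypothetical helper (not part of CUDA or CULA): bytes needed for a dense
// rows x cols matrix of floats, assuming no padding between rows.
constexpr std::size_t matrix_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(float);
}

// Convert a byte count to decimal megabytes (1 MB = 1,000,000 bytes).
constexpr double to_mb(std::size_t bytes) {
    return static_cast<double>(bytes) / 1.0e6;
}
```

On a platform where `sizeof(float)` is 4, `matrix_bytes(1000000, 44)` evaluates to 176000000 bytes, which `to_mb` reports as 176 MB. This covers the matrix only; any workspace a library allocates on top of it is not included.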
What am I doing wrong? I have the feeling not all my memory is being used…

seibert · March 27, 2011, 3:42pm

CUDA needs to use some of the device memory for its own purposes, although not so much that it would explain the problem you are having. At the beginning of your program, you should call cudaMemGetInfo() to determine how much memory is available. I’m curious whether the amount of device memory used by the driver is abnormally high for some reason.

Joachimvdh · March 28, 2011, 12:31am

Hey seibert,
Thank you for the information. I did this and got the following (free bytes, free in MB, total bytes, total in MB, usage in MB):
294019072 280.398 536150016 511.312 230.914

I then removed the initialisations for cublas and cula, and got the following:
297054208 283.293 536150016 511.312 228.02
So about half of my GPU memory is filled with something… Could it be because I am running KDE? I must be doing something wrong for sure.
(By the way, I had just rebooted before the test, so this was right after startup.)
I rebooted again to make sure it wasn’t caching the libraries, and it was even worse:
265269248 252.98 536150016 511.312 258.332

So I did another test, and after reading the matrix with 1000000 rows:
108274944 103.259 536150016 511.312 408.053
I then tried 1100000. Top usage was:
85361920 81.4075 536150016 511.312 429.905

So the allocation definitely fails because the memory is full. Any ideas why the memory is already 50% full at startup?

Thanks for the help.
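For anyone reproducing these measurements, the five columns can be derived from the two byte counts that `cudaMemGetInfo(&free, &total)` writes. The sketch below is plain C++ and takes those two values as parameters, so it makes no CUDA calls itself. `MemReport`, `to_mib` and `make_report` are hypothetical names, and the use of MiB (1024 x 1024 bytes) is an assumption inferred from the posted numbers.

```cpp
#include <cstddef>

// Hypothetical report struct; the fields mirror the five posted columns.
struct MemReport {
    std::size_t free_bytes;
    double free_mib;
    std::size_t total_bytes;
    double total_mib;
    double used_mib;
};

// Bytes to MiB (1 MiB = 1024 * 1024 bytes), the unit the posted numbers
// appear to use.
inline double to_mib(std::size_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Build the report from the free and total byte counts, for example the two
// values written by cudaMemGetInfo.
inline MemReport make_report(std::size_t free_bytes, std::size_t total_bytes) {
    MemReport r{free_bytes, to_mib(free_bytes),
                total_bytes, to_mib(total_bytes), 0.0};
    r.used_mib = r.total_mib - r.free_mib;
    return r;
}
```

With the first measurement, `make_report(294019072, 536150016)` gives roughly 280.398 MiB free, 511.312 MiB total and 230.914 MiB in use, which matches the first posted line.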

seibert · March 28, 2011, 12:41am

Hmm, can you try stopping X and running your memory test from the console? I have no idea how much device memory a GUI desktop requires.

I had in mind that CUDA took ~150 MB of memory from the device, but I haven’t checked if that has changed significantly with new releases. (It’s been a while since I’ve run code seriously on our 8800 GT. I should figure out which compute node it has wandered off to and check this…)

Joachimvdh · March 28, 2011, 3:03pm

OK, I ran the application in runlevel 2 and only 29 MB was in use at the start. I guess I have my answer.

Thanks,

Joachim
