Even though my card has 2651 MB of memory, any time I try to use cudaMalloc to allocate more than 1151 MB in a single call, I get the error: Runtime API Error: out of memory.
Note that this only happens when I request more than 1151 MB in a single cudaMalloc call. If I split the request up into smaller chunks, it works fine.
Do you know of any acceptable workaround? (Other than distributing my data structure across several, noncontiguous chunks of memory.) For example, if I issue multiple cudaMalloc requests consecutively, and I am the only person using the device, can I then treat the union of those memory blocks as one large memory block?
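The chunked workaround described above can be sketched as follows. This is a minimal illustration, not code from the thread; the 2000 MB total and 1 GB chunk size are arbitrary assumptions, and it requires a CUDA-capable GPU to run. Note the caveat in the comments, which bears on the question: separate cudaMalloc calls are not guaranteed to return contiguous addresses, so the chunks cannot safely be treated as one flat buffer.

```cpp
// Sketch: allocate a large buffer as several smaller chunks when a single
// cudaMalloc of the full size fails under the WDDM allocation limit.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

int main() {
    const size_t total = 2000ULL << 20;   // 2000 MB requested in total (assumption)
    const size_t chunk = 1024ULL << 20;   // 1 GB per cudaMalloc call (assumption)
    std::vector<void*> blocks;

    size_t remaining = total;
    while (remaining > 0) {
        size_t n = remaining < chunk ? remaining : chunk;
        void* p = nullptr;
        if (cudaMalloc(&p, n) != cudaSuccess) {
            fprintf(stderr, "cudaMalloc of %zu MB failed\n", n >> 20);
            break;
        }
        blocks.push_back(p);
        remaining -= n;
    }

    // Caveat: the returned pointers are NOT guaranteed to be contiguous,
    // even when nothing else is using the device. Kernels must index into
    // the correct chunk rather than treating the union as one large block.
    for (void* p : blocks) cudaFree(p);
    return 0;
}
```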
Alternatively, I could install CentOS. Do you know whether this issue affects the Linux drivers?
Linux is completely unaffected by these allocation limits. The TCC driver is also unaffected; you can use it with a C1060 if you don't need decent display output, since TCC can't currently coexist with standard NVIDIA WDDM devices.
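If I understand correctly, switching a supported card to the TCC driver model on Windows can be done with nvidia-smi; a sketch (run from an administrator shell, GPU index 0 assumed, reboot required afterwards):

```shell
# -dm / --driver-model: 0 = WDDM, 1 = TCC (requires admin rights and a reboot)
nvidia-smi -i 0 -dm 1
```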
> It sounds like the reason a single allocation is limited is that, under WDDM, every device allocation must be backed by (pageable to) system memory, so an allocation larger than the system memory available to the GPU could never fit in a system-memory page.
Using the TCC compute driver bypasses this paging requirement and gives you full control of device memory (at least, that is my understanding).