Memory allocation reliability

KevinK13 · August 13, 2008, 4:18pm

Hi, hopefully someone can explain my problem.

We allocate two large blocks of memory, use a kernel to process the data, and then free the memory. This cycle is repeated continuously for hours at a time, and after a varying number of iterations we get an ‘unable to allocate memory’ error. After much testing we have reduced the code to the attached memAlloc.cpp, which just allocates and then frees blocks of memory with no GPU kernel processing.

The pseudocode for the attached file is as follows:

Set block size to 140 Mbytes
// allocate a large block of memory to ensure increasing the block size is possible
Allocate a single 420 Mbyte block of GPU memory
Free the 420 Mbyte block of GPU memory

Loop 1000 times or stop on error
    Allocate a block of memory - block1
    Allocate a second block of memory - block2
    Free block1
    Free block2
    Increase the size of the block by 320 bytes

The memory allocation fails somewhere between 500 and 800 iterations.

The same program without changing the size of the allocated blocks will run for many minutes (forever?). A single block, of fixed or changing size, will also run for many minutes (forever?).
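
The pseudocode above can be sketched roughly as follows. This is a reconstruction from the description only, since the contents of memAlloc.cpp are not shown in the thread. The name runAllocLoop and its parameters are hypothetical, and the allocator is passed in as a pair of callables so the loop can be exercised on the host. For the GPU case, allocFn would presumably wrap cudaMalloc (returning nullptr unless it reports cudaSuccess) and freeFn would wrap cudaFree.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical reconstruction of the test loop; not the actual memAlloc.cpp.
// allocFn returns nullptr on failure; freeFn releases a block.
using AllocFn = std::function<void*(std::size_t)>;
using FreeFn  = std::function<void(void*)>;

// Returns the number of loop iterations completed before the first failed
// allocation (maxIterations if none failed, 0 if the warm-up block failed).
int runAllocLoop(const AllocFn& allocFn, const FreeFn& freeFn,
                 std::size_t warmupBytes, std::size_t startBytes,
                 std::size_t stepBytes, int maxIterations)
{
    assert(allocFn && freeFn);

    // Allocate and free one large block first, as in the pseudocode.
    void* big = allocFn(warmupBytes);
    if (big == nullptr) return 0;
    freeFn(big);

    std::size_t size = startBytes;
    for (int i = 0; i < maxIterations; ++i) {
        void* block1 = allocFn(size);
        if (block1 == nullptr) return i;

        void* block2 = allocFn(size);
        if (block2 == nullptr) {
            freeFn(block1);
            return i;
        }

        // Free in the same order as the pseudocode: block1, then block2.
        freeFn(block1);
        freeFn(block2);

        size += stepBytes;  // grow the request for the next iteration
    }
    return maxIterations;
}
```

With the numbers from the post, the call would be runAllocLoop(allocFn, freeFn, 420000000, 140000000, 320, 1000), reading "Mbytes" as 10^6 bytes.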

GPUs tested: 8800 GTX, 9800 GTX, GTX 260
Driver 177.35
Windows XP Professional Service Pack 2 (32-bit)
Intel quad-core PC
AMD dual-core PCs
memAlloc.cpp (2.2 KB)

cbuchner1 · August 13, 2008, 6:12pm

Heap fragmentation!

pfccpp · August 13, 2008, 6:29pm

I’ve tested your code and it works without problems on my PC:

8800 GTX
CUDA 2.0b2
Driver 177.83

KevinK13 · August 14, 2008, 10:47am

Thanks for testing.

I will get driver 177.83 and try with it. I already use CUDA 2.0b2.

iceberg · August 14, 2008, 11:59am

I have tested your application and ran into the same problem. The application output is as follows:

Successful allocate 410156 KBytes memory
Successful free memory
Start loop allocating 2 memory blocks of 136718 KBytes and then free the blocks
The size of the blocks is increased by 320 for each iteration
loop count 100 size 136750 KBytes
loop count 200 size 136781 KBytes
loop count 300 size 136812 KBytes
loop count 400 size 136843 KBytes
loop count 500 size 136875 KBytes
loop count 600 size 136906 KBytes
Failed on allocating GPU memory gpudata1. cudaError message: unknown error
Exception error. Iteration 606
Press ENTER to exit...

GTX260
CUDA 2.0Beta2
VS2005
Windows XP Professional SP2
Xeon E5410
Memory 4GB
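
The sizes printed in the log above can be reproduced with simple arithmetic. The sketch below assumes that "Mbytes" means 10^6 bytes, that the step is 320 bytes, that "loop count N" means N increments have been applied, and that sizes are printed as whole KBytes of 1024 bytes, truncated. These are inferences from the printed numbers, not something the thread states, and the helper names are made up for illustration. Under that reading each block has grown by less than 200 KB by iteration 606, so the failure does not appear to come from the request itself becoming much larger.

```cpp
#include <cassert>
#include <cstdint>

// Assumed reading of the post: 140 * 10^6 bytes to start, 320 bytes per step.
constexpr std::uint64_t kStartBytes  = 140000000;
constexpr std::uint64_t kWarmupBytes = 420000000;
constexpr std::uint64_t kStepBytes   = 320;

// Block size in bytes after `iteration` increments.
constexpr std::uint64_t blockBytes(std::uint64_t iteration)
{
    return kStartBytes + kStepBytes * iteration;
}

// Same size in KBytes (1024 bytes), truncated as the log appears to do.
constexpr std::uint64_t blockKBytes(std::uint64_t iteration)
{
    return blockBytes(iteration) / 1024;
}

// Compile-time checks against values that appear in the log.
static_assert(kWarmupBytes / 1024 == 410156, "warm-up block");
static_assert(blockKBytes(0) == 136718, "initial block");
static_assert(blockKBytes(100) == 136750, "loop count 100");
static_assert(blockKBytes(600) == 136906, "loop count 600");

// Runtime version of the same comparison for the remaining log lines.
void checkAgainstLog()
{
    assert(blockKBytes(200) == 136781);
    assert(blockKBytes(300) == 136812);
    assert(blockKBytes(400) == 136843);
    assert(blockKBytes(500) == 136875);
}
```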

theMarix · August 14, 2008, 12:23pm

The application segfaults for me:

gdb ../../bin/linux/release/seCudaMemSmoke
GNU gdb 6.6-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) run
Starting program: /afs/kip.uni-heidelberg.de/user/mbach2/gpu-dev00/NVIDIA/cudaSDK/bin/linux/release/seCudaMemSmoke
[Thread debugging using libthread_db enabled]
[New Thread 47798758420832 (LWP 28537)]
Successful allocate 410156 KBytes memory
Successful free memory
Start loop allocating 2 memory blocks of 136718 KBytes and then free the blocks
The size of the blocks is increased by 320 for each iteration
loop count 100 size 136750 KBytes
loop count 200 size 136781 KBytes
loop count 300 size 136812 KBytes
loop count 400 size 136843 KBytes
loop count 500 size 136875 KBytes
loop count 600 size 136906 KBytes

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 47798758420832 (LWP 28537)]
0x00002b790407d483 in ?? () from /usr/lib/libcuda.so
(gdb) backtrace
#0  0x00002b790407d483 in ?? () from /usr/lib/libcuda.so
#1  0x00002b7904070a23 in ?? () from /usr/lib/libcuda.so
#2  0x00002b79040650d9 in ?? () from /usr/lib/libcuda.so
#3  0x00002b7902ce370b in cudaMalloc () from /usr/local/cuda/lib/libcudart.so.2
#4  0x0000000000400c91 in main ()
(gdb)

KevinK13 · August 15, 2008, 9:58am

Thanks to those who tested this problem.

I have now installed driver 177.83 and the test loop will now run for 10000+ iterations without failure.

GTX 260, 8800 GTX, CUDA 2.0b2, driver 177.83

Mu-Chi_Sung · August 15, 2008, 7:34pm

IMHO, when you have frequent allocations/deallocations, it’s better to write your own allocator instead of using cudaMalloc/cudaFree. I wrote my own allocator to avoid fragmentation, and it also speeds up allocations/deallocations by a large factor.
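
The thread does not show either poster's allocator, so the following is only one possible shape for such a scheme: a minimal bump ("arena") allocator that hands out offsets inside a single range reserved once at start-up. The class name OffsetArena, the constant kInvalidOffset, and the default 256-byte alignment are assumptions made for this sketch. On the GPU, the caller would add the returned offset to a base pointer obtained from one cudaMalloc call, so the per-iteration cudaMalloc/cudaFree pair disappears.

```cpp
#include <cassert>
#include <cstddef>

// Returned by allocate() when the request does not fit (hypothetical name).
constexpr std::size_t kInvalidOffset = static_cast<std::size_t>(-1);

// Hypothetical bump allocator over one pre-allocated range of bytes.
// It tracks offsets only; it never touches the memory itself.
class OffsetArena {
public:
    explicit OffsetArena(std::size_t capacityBytes, std::size_t alignment = 256)
        : capacity_(capacityBytes), alignment_(alignment), used_(0)
    {
        // The rounding below requires a power-of-two alignment.
        assert(alignment_ != 0 && (alignment_ & (alignment_ - 1)) == 0);
    }

    // Returns the offset of a new block, or kInvalidOffset if it does not fit.
    std::size_t allocate(std::size_t bytes)
    {
        // Round the current position up to the next aligned offset.
        const std::size_t start = (used_ + alignment_ - 1) & ~(alignment_ - 1);
        if (start > capacity_ || bytes > capacity_ - start) {
            return kInvalidOffset;
        }
        used_ = start + bytes;
        return start;
    }

    // Releases every block at once, e.g. at the end of a processing cycle.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::size_t capacity_;
    std::size_t alignment_;
    std::size_t used_;
};
```

For the cycle described in the first post, each iteration would call allocate twice (block1 and block2), run the kernel on base + offset, and then call reset. Because nothing is freed individually, this design cannot fragment, at the cost of only supporting "free everything" semantics.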

KevinK13 · August 18, 2008, 8:31am

Thanks, we have implemented our own memory management and all is well :-)
