The first CUDA call also initializes the context, which uses quite a bit of memory for the heap, stack, and printf buffer. Subsequent calls to cudaMalloc() will only take up the memory that is allocated (plus some small wasted space to ensure proper alignment).
The heap is there to serve future allocations from the device, not from the host. So a second call to cudaMalloc() will not be served from the heap set up while initializing the context during the first cudaMalloc(). The point I wanted to make is that the second call does not reserve another 230 MB.
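To illustrate the distinction, here is a minimal sketch (the sizes and names are just placeholders): device-side malloc() inside a kernel is served from the per-context heap governed by CU_LIMIT_MALLOC_HEAP_SIZE, while host-side cudaMalloc() allocates directly from global memory and does not touch that heap.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device-side allocation: served from the context's device heap,
// whose size is controlled by CU_LIMIT_MALLOC_HEAP_SIZE.
__global__ void deviceAlloc()
{
    int *p = (int *)malloc(64 * sizeof(int));
    if (p) {
        p[0] = 42;
        free(p);
    }
}

int main()
{
    // Host-side allocation: comes straight from global memory,
    // not from the device heap reserved at context creation.
    int *d_buf = nullptr;
    cudaMalloc(&d_buf, 1 << 20);

    deviceAlloc<<<1, 1>>>();
    cudaDeviceSynchronize();

    cudaFree(d_buf);
    return 0;
}
```

If the kernel's malloc() requests exceed the heap limit, they return NULL; the heap cannot grow after the first kernel launch, which is why the limit must be set before the context is used.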
If you are not using printf() and malloc() on the device, and have no recursion or deeply nested function calls, you can limit the size of the context by calling cuCtxSetLimit() before the first cudaMalloc().
I have done as you said:
cuInit(0);                                        // required before any driver API call
CUcontext cuContext;
cuCtxCreate(&cuContext, 0, 0);                    // last argument is the device ordinal
size_t size = 0;
cuCtxSetLimit(CU_LIMIT_STACK_SIZE, 0);
cuCtxSetLimit(CU_LIMIT_PRINTF_FIFO_SIZE, 0);
cuCtxSetLimit(CU_LIMIT_MALLOC_HEAP_SIZE, 0);
cuCtxGetLimit(&size, CU_LIMIT_STACK_SIZE);        // size = 1K
cuCtxGetLimit(&size, CU_LIMIT_PRINTF_FIFO_SIZE);  // size = 800K
cuCtxGetLimit(&size, CU_LIMIT_MALLOC_HEAP_SIZE);  // size = 4M
void *devPtr;
cudaMalloc(&devPtr, 1024);                        // first runtime allocation
When I called the three limit functions, I found the memory usage decreased from 230 MB to 210 MB.
Then I ran the examples from the CUDA SDK, matrixMulDrv and matrixMulDynlinkJIT, and found something strange: the cuCtxCreate call in matrixMulDrv occupies 200 MB of memory, but the cuCtxCreate call in matrixMulDynlinkJIT occupies only 100 MB.
I am very confused.