How to get more global memory available

Hello community,

I wouldn’t mind losing ~10 MB to the driver’s numerous allocations for local memory, stack memory, constant memory, instruction RAM, the malloc heap and the printf heap, but I had to realise that more than half of my global memory is already occupied.

At the beginning of my application I query the amount of free memory:

size_t free, total;
cudaMemGetInfo(&free, &total);

I get 536870912 bytes for total memory, but only 218918912 bytes for free memory.
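For reference, a minimal sketch of such a query with error checking could look like the following. The variable names are my own, and the program assumes the default device (device 0) is the card in question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    size_t free_bytes = 0, total_bytes = 0;

    // This call initializes the CUDA context on the current device
    // if no context exists yet.
    cudaError_t err = cudaMemGetInfo(&free_bytes, &total_bytes);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Cast to unsigned long long so the format string also works with older MSVC.
    std::printf("total : %llu bytes\n", (unsigned long long)total_bytes);
    std::printf("free  : %llu bytes\n", (unsigned long long)free_bytes);
    std::printf("in use: %llu bytes\n", (unsigned long long)(total_bytes - free_bytes));
    return 0;
}
```

The "in use" figure covers everything already resident on the card, not only what the CUDA program itself allocated.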

I should mention that this is my only device (Quadro 410) and that I am also using it to drive my display. So I’m not sure whether I have to accept less available memory when the graphics card is used for display purposes as well.

It would be great if someone could answer my question with a short explanation.

By the way, I’m using VS2010 x64 and CUDA 5.0; the Quadro 410 has compute capability 3.0.

many tnx in advance

cheers greg

There are a couple of parameters you can tune before launching the first kernel to reduce memory consumption.
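A rough sketch of what tuning these parameters could look like, using cudaDeviceSetLimit() before any kernel is launched. The sizes below are illustrative assumptions, not recommendations, and the ok() helper is made up for this example. Whether the reported free memory changes by exactly these amounts may depend on the driver.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: print the error and report success/failure.
static bool ok(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    // Illustrative values only (assumptions, not recommendations).
    const size_t heap_bytes = 1 << 20;   // 1 MiB device malloc heap
    const size_t fifo_bytes = 64 << 10;  // 64 KiB device printf buffer

    // Limits have to be set before the first kernel that uses
    // device-side malloc() or printf() is launched.
    if (!ok(cudaDeviceSetLimit(cudaLimitMallocHeapSize, heap_bytes), "set malloc heap size")) return 1;
    if (!ok(cudaDeviceSetLimit(cudaLimitPrintfFifoSize, fifo_bytes), "set printf FIFO size")) return 1;

    size_t free_bytes = 0, total_bytes = 0;
    if (!ok(cudaMemGetInfo(&free_bytes, &total_bytes), "cudaMemGetInfo")) return 1;

    std::printf("free after adjusting limits: %llu of %llu bytes\n",
                (unsigned long long)free_bytes, (unsigned long long)total_bytes);
    return 0;
}
```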

Hi tera,

I have come across these parameters already.
The default values are:
cudaLimitStackSize == 1.024
cudaLimitPrintfFifoSize == 1.048.576
cudaLimitMallocHeapSize == 8.388.608
cudaLimitDevRuntimeSyncDepth == not supported at comp. cap. 3.0
cudaLimitDevRuntimePendingLaunchCount == not supported at comp. cap. 3.0
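The values currently in effect can be read back with cudaDeviceGetLimit(). A small sketch, limited to the three byte-valued limits listed above (the LimitEntry struct is just a helper invented for this example):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Helper type for this sketch: pairs a limit with a printable name.
struct LimitEntry {
    cudaLimit   limit;
    const char* name;
};

int main() {
    const LimitEntry entries[] = {
        { cudaLimitStackSize,      "cudaLimitStackSize"      },
        { cudaLimitPrintfFifoSize, "cudaLimitPrintfFifoSize" },
        { cudaLimitMallocHeapSize, "cudaLimitMallocHeapSize" },
    };
    const int count = sizeof(entries) / sizeof(entries[0]);

    for (int i = 0; i < count; ++i) {
        size_t value = 0;
        cudaError_t err = cudaDeviceGetLimit(&value, entries[i].limit);
        if (err == cudaSuccess) {
            std::printf("%-24s = %llu bytes\n", entries[i].name, (unsigned long long)value);
        } else {
            // Print the error text instead of a value if the query fails.
            std::printf("%-24s : %s\n", entries[i].name, cudaGetErrorString(err));
        }
    }
    return 0;
}
```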

The programming guide says:
cudaLimitStackSize controls the stack size in bytes of each GPU thread.
The max. number of threads per multiprocessor at comp. cap. 3.0 is 2.048 and the Quadro 410 has only one multiprocessor, so the stacks should consume 1.024 x 2.048 = 2.097.152 bytes.
Altogether that is 2.097.152 + 1.048.576 + 8.388.608 = 11.534.336 bytes.

So 306.417.664 bytes are still unaccounted for (536.870.912 - 218.918.912 - 11.534.336).
Even if cudaLimitDevRuntimeSyncDepth and cudaLimitDevRuntimePendingLaunchCount used some extra bytes, the gap would still be far too large.
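To make the estimate easy to re-check, here is the same arithmetic as a few plain host-side functions. The function names are made up for this sketch, and it assumes, as the calculation above does, that every resident thread gets a full stack.

```cpp
// Stack reservation if every resident thread on every multiprocessor
// gets its full per-thread stack.
inline unsigned long long stack_reservation_bytes(unsigned long long stack_bytes_per_thread,
                                                  unsigned long long max_threads_per_sm,
                                                  unsigned long long sm_count) {
    return stack_bytes_per_thread * max_threads_per_sm * sm_count;
}

// Sum of the three configurable reservations discussed in this thread.
inline unsigned long long configurable_bytes(unsigned long long stack_bytes,
                                             unsigned long long printf_fifo_bytes,
                                             unsigned long long malloc_heap_bytes) {
    return stack_bytes + printf_fifo_bytes + malloc_heap_bytes;
}

// Memory that is neither free nor explained by the configurable limits.
inline unsigned long long unaccounted_bytes(unsigned long long total_bytes,
                                            unsigned long long free_bytes,
                                            unsigned long long explained_bytes) {
    return total_bytes - free_bytes - explained_bytes;
}
```

Feeding in the total and free values reported by cudaMemGetInfo() together with the three limits gives the part of the occupied memory that the limits cannot explain.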

Please correct me if my calculation is wrong.

The rest of the memory is almost certainly used for your display and some is required for the CUDA driver. My GTX 650M runs my display, and it typically only has 600 MB of memory free for CUDA programs.

Thank you seibert,

I assume the best solution would be to have a separate card.


I found a GTS 450 in my office and plugged it in alongside the Quadro 410.
Here are the results:

total memory = 1.073.741.824
free memory = 978.780.160
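With more than one card installed, something along these lines could be used to compare them and pick the one with the most free memory. This is only a sketch: it assumes that probing each device once at startup is acceptable, and it resets each device after probing so the temporary contexts do not stay resident.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess || device_count == 0) {
        std::fprintf(stderr, "no usable CUDA device found\n");
        return 1;
    }

    int best_device = -1;
    size_t best_free = 0;

    for (int dev = 0; dev < device_count; ++dev) {
        cudaDeviceProp prop;
        size_t free_bytes = 0, total_bytes = 0;

        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) continue;
        if (cudaSetDevice(dev) != cudaSuccess) continue;

        // Creates a context on this device as a side effect.
        cudaError_t err = cudaMemGetInfo(&free_bytes, &total_bytes);

        // Destroy the probing context again so it does not occupy memory.
        cudaDeviceReset();
        if (err != cudaSuccess) continue;

        std::printf("device %d (%s): total = %llu, free = %llu bytes\n",
                    dev, prop.name,
                    (unsigned long long)total_bytes,
                    (unsigned long long)free_bytes);

        if (best_device < 0 || free_bytes > best_free) {
            best_device = dev;
            best_free = free_bytes;
        }
    }

    if (best_device < 0) {
        std::fprintf(stderr, "could not query any device\n");
        return 1;
    }

    // Later allocations and kernel launches in this thread go to this device.
    if (cudaSetDevice(best_device) != cudaSuccess) return 1;
    std::printf("using device %d\n", best_device);
    return 0;
}
```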

I would say, much better!

tnx a lot for the advice