How can I set a malloc heap size greater than 4GB?

const int malloc_limit = 4608*1024*1024;
error = cuCtxSetLimit(CU_LIMIT_MALLOC_HEAP_SIZE, malloc_limit);
if (error != CUDA_SUCCESS) Cleanup(false);

I tried this code to set a 4.5GB heap size for malloc. Obviously, this won’t work because a 32-bit int only ranges from about -2G to 2G. So how can I set 4.5GB? Is there a new flag to set it in KB or MB?

Thanks in advance!

The second parameter of cuCtxSetLimit() is a size_t, which is 64 bits wide on 64-bit systems.

Thank you very much for your quick reply.

I changed const int to const size_t, but I still get this warning for the line “const size_t malloc_limit = 4608*1024*1024;”:

warning: integer overflow in expression [-Woverflow]

What to do?

You should have been able to find a solution for that warning message by searching the web.

These should both get rid of the warning under 64-bit systems:
const size_t malloc_limit = size_t(4608) * size_t(1024) * size_t(1024);
const size_t malloc_limit = 4608ull * 1024ull * 1024ull;
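The warning appears because each literal in 4608*1024*1024 is an int, so the product is evaluated in int before it is ever converted to size_t. Here is a minimal sketch in plain C (no CUDA needed) of what gets lost. The function names are made up for illustration, and uint32_t is used in place of int because unsigned wraparound is well defined while signed overflow is not.

```c
#include <stdint.h>

/* Hypothetical helper: MiB -> bytes, evaluated in 32-bit unsigned
   arithmetic. The result wraps modulo 2^32 once it exceeds 4 GiB. */
uint32_t heap_bytes_32bit(uint32_t mib)
{
    return mib * UINT32_C(1024) * UINT32_C(1024);
}

/* Hypothetical helper: the same product evaluated in 64-bit arithmetic,
   which is what size_t gives you on a 64-bit build. */
uint64_t heap_bytes_64bit(uint64_t mib)
{
    return mib * UINT64_C(1024) * UINT64_C(1024);
}
```

With these, heap_bytes_32bit(4608) yields 536870912 (only 512 MiB), while heap_bytes_64bit(4608) yields the intended 4831838208.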

Thank you very much for your reply. I added a (size_t) cast in front of the integers and it compiled without the warning.

But I still couldn’t allocate more than 4GB of heap in practice. I ran nvidia-smi alongside my program and noticed that no matter what I set above 4096*1024*1024, VRAM usage remains at 4192/6143MiB. If I set it below 4096*1024*1024, the number in nvidia-smi changes accordingly. What can I do?

Here’s an example for host-side allocation. Device-side allocation should be similar.

It works fine on all my devices including a K20c with 5GB of mem.

Note that you need to compile as a 64-bit application if you plan on allocating more than 4GB.
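One quick way to confirm that a build is really 64-bit is to check whether the requested byte count fits in size_t at all. This is only a sketch in plain C; fits_in_size_t is a hypothetical helper, not part of CUDA.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if `bytes` is representable as a size_t
   in the current build, 0 otherwise. SIZE_MAX is 2^32-1 on a typical
   32-bit build and 2^64-1 on a typical 64-bit build. */
int fits_in_size_t(uint64_t bytes)
{
    return bytes <= (uint64_t)SIZE_MAX;
}
```

On a 64-bit build, fits_in_size_t(4831838208) returns 1. On a 32-bit build it returns 0, which means a 4.5GB request cannot even be passed to the API without truncation.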

Thank you for your reply. I compiled with -m64 but nothing changed.

I am malloc’ing 4.8GB inside my device ptx code. Maybe there is a limit there?

What exactly is your system configuration?
OS, version, bitness, graphics driver version, GPU(s), VRAM.

Ubuntu 12.04 64-bit but I am running a 3.5.0-46 kernel

My driver is 334.16, which is a beta. I am using it because the latest stable one doesn’t support my Titan Black 6GB.

CUDA 5.5.22 Toolkit

My legacy code was written for GTX 470 1.25GB VRAM. I am trying to port it to my new card.

Thanks a lot in advance!

Looks like it is not possible to set a heap size bigger than 4GB. I ran the following code:

const size_t malloc_limit = (size_t) 4608 * 1024 * 1024;
error = cuCtxSetLimit(CU_LIMIT_MALLOC_HEAP_SIZE, malloc_limit);
if (error != CUDA_SUCCESS) Cleanup(false);
size_t malloced = 0;
error = cuCtxGetLimit(&malloced, CU_LIMIT_MALLOC_HEAP_SIZE);
if (error != CUDA_SUCCESS) Cleanup(false);
printf("%zu\n", malloc_limit);  /* %zu is the portable format for size_t */
printf("%zu\n", malloced);

And got a print out of this:


Can someone give me an official answer that this is possible or not? Thanks a lot.

I checked with the CUDA driver team. They confirmed that a hard 4 GB limit is currently in place. A bug has been filed. Sorry for the inconvenience and thank you for making us aware of this issue.

Thanks for your reply. Should I expect the bug to be fixed in the 6.0 Toolkit or in an updated 5.5 Toolkit?

If it’s indeed a driver problem or limitation, then a driver update will fix it.

I updated the driver to 337.12 beta but the bug remains. So the fix is not there yet?