Maximum allocatable memory block: is 1.7 GB the limit?


I have a simple application that allocates memory blocks of size 4096 x 1024 x N (I tried both cudaMalloc3D). Around N = 425 I get “out of memory” errors (running on a 4 GB Tesla S1070). Using cudaMallocPitch I ran into a similar limit of around 1.7 GB with 2D arrays. Is this a known limitation? Has anyone managed to allocate larger blocks? Is there any documentation about this limitation?

Thanks && kind regards

Edit: I forgot to mention, I’m running on Windows Server 2008 with CUDA 2.2
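
For reference, the size at which the allocation starts failing can be worked out directly from the dimensions in the question. This is only a sketch: the helper names are made up for illustration, and it assumes 1-byte elements, since the post does not state the element type.

```cpp
#include <cstdint>

// Bytes requested for a width x height x depth volume.
// elem_size = 1 is an assumption; the post does not give the element type.
constexpr std::uint64_t volume_bytes(std::uint64_t width, std::uint64_t height,
                                     std::uint64_t depth, std::uint64_t elem_size = 1) {
    return width * height * depth * elem_size;
}

// Convert a byte count to whole mebibytes (1 MiB = 1024 * 1024 bytes).
constexpr std::uint64_t bytes_to_mib(std::uint64_t bytes) {
    return bytes / (1024ull * 1024ull);
}
```

Under that assumption, volume_bytes(4096, 1024, 425) is 1,782,579,200 bytes, i.e. 1700 MiB, which matches the roughly 1.7 GB limit described above.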

It is a Windows Vista “feature”. From the release notes:

o The maximum size of a single allocation created by cudaMalloc
or cuMemAlloc is limited to:
MIN ( ( System Memory Size in MB - 512 MB ) / 2, PAGING_BUFFER_SEGMENT_SIZE )
For Vista, PAGING_BUFFER_SEGMENT_SIZE is approximately 2GB.

Windows Server 2008 is Vista-based.
On Linux (and maybe Windows XP), this limitation is not present.
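
The formula from the release notes can be evaluated for a given machine along these lines. This is a rough sketch, not anything from the CUDA toolkit: the function and constant names are hypothetical, and 2048 MB is assumed for PAGING_BUFFER_SEGMENT_SIZE because the notes only say "approximately 2GB".

```cpp
#include <cstdint>

// Assumed value; the release notes only say "approximately 2GB" for Vista.
const std::uint64_t kPagingBufferSegmentMB = 2048;

// MIN((system memory in MB - 512 MB) / 2, PAGING_BUFFER_SEGMENT_SIZE), in MB.
inline std::uint64_t max_single_alloc_mb(std::uint64_t system_memory_mb) {
    // Guard against unsigned underflow on hosts with 512 MB or less.
    const std::uint64_t host_term =
        system_memory_mb > 512 ? (system_memory_mb - 512) / 2 : 0;
    return host_term < kPagingBufferSegmentMB ? host_term : kPagingBufferSegmentMB;
}
```

For example, a host with 4096 MB of system memory would give (4096 - 512) / 2 = 1792 MB, which is in the same range as the limit reported in the question. The thread does not say how much system memory the machine has, so this is only suggestive.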

Thanks a lot! Unfortunately I’m stuck with that OS… :/

After testing a CUDA-based application (I checked Jacket's operation and performance under Windows and CentOS), I can confirm that every version of Windows starting with Vista (Vista, Server 2008, Windows 7, Server 2008 R2) is affected by this limitation. Windows XP 64-bit and CentOS 5.4 64-bit are not subject to the approximately 2 GB PAGING_BUFFER_SEGMENT_SIZE limit and can use all of the Tesla C1060's memory, i.e. 4 GB.

Guess… Intel struck a deal with MS? :)