CL_DEVICE_MAX_MEM_ALLOC_SIZE Incorrect?

Keldor314 · October 24, 2010, 11:27am

My GTX 295 reports about 220 MB for CL_DEVICE_MAX_MEM_ALLOC_SIZE, which is in line with the OpenCL spec’s minimum value of 1/4th CL_DEVICE_GLOBAL_MEM_SIZE. However, in practice I’ve been able to allocate and use buffers as large as 512MB, more than twice the stated maximum, with correct results.

What’s up with this?

Also, is there any way to determine how much memory can be allocated and be resident on the device all at once? Since my algorithm is a global scatter, the only way to break it into segments is to rerun the entire program for each segment, discarding/clipping all points that do not fall into the segment currently in memory. Thus, splitting the problem into the fewest segments such that a given segment fits into memory is critical to my program’s performance for large problem sizes.
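
The segment-count arithmetic described above can be sketched as follows. This is a minimal illustration, not the poster's code: `min_segments` and `budget_bytes` are hypothetical names, and the budget is a figure the caller has to choose. As far as I know, the core OpenCL API reports total global memory but has no query for how much is currently free.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest number of passes such that each segment fits in budget_bytes.
 * budget_bytes is a caller-chosen assumption (for example the reported
 * CL_DEVICE_MAX_MEM_ALLOC_SIZE, or a larger size found by trial), not a
 * value the runtime guarantees. */
uint64_t min_segments(uint64_t total_bytes, uint64_t budget_bytes)
{
    assert(budget_bytes > 0);
    /* Ceiling division, written to avoid overflow in total + budget - 1. */
    return total_bytes / budget_bytes + (total_bytes % budget_bytes != 0);
}
```

For a 1024 MiB data set, a 220 MiB budget gives 5 passes and a 512 MiB budget gives 2, which is why the real usable size matters so much here.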

Martin_Nilsson · October 27, 2010, 7:06am

I’ve seen the overly pessimistic reporting of MAX_MEM_ALLOC_SIZE as well, and wondered whether this is a bug or whether we’ve just been lucky so far and such large allocations may fail at any time in the future.

behuber · November 7, 2010, 3:02pm

I seem to remember reading somewhere that the G200 series of GPUs implements caching similar to that of a CPU. It could be that most of your 512 MB buffer is actually stored in system RAM, but only 220 MB at most is being cached in the VRAM at any given time. If this is the case, it probably won’t hurt anything but could lead to performance issues for random access (sequential access is probably ok).

Martin_Nilsson · November 8, 2010, 8:51am

I don’t think that is the case. My card has a GiB of VRAM and I have not noticed any sudden reductions in performance when allocating more than the stated maximum, even with somewhat random accesses.

Mr_Nuke · November 8, 2010, 10:42am

I would call it an NVIDIA-problem, but I don’t have enough data. I wonder what ATI cards report.

Anyway, I had a similar thread a while back, where I was complaining that CUDA allocations would fail above a certain threshold, which at the time appeared to be the CL_DEVICE_MAX_MEM_ALLOC_SIZE. tmurray classified it as a WDDM limitation, and we all agreed, case closed. Later, I started having the same memory allocation problem on Linux.

I have an algorithm that needs two big chunks of data: size 2n and size n. If the maximum safe allocation is 1/4 of the total memory, then the 2n buffer caps n at 1/8 of the total, so I can only use at most 3/8 of the total memory. I think it’s just a mistake on NVIDIA’s side, where they took the OpenCL spec too literally and just return 1/4 of the total device memory. (Just a guess.)
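
The 3/8 figure can be checked with a few lines of arithmetic. This is only a sketch under the post's assumptions (two buffers of size 2n and n, each limited to the reported maximum allocation); `largest_n` and `total_footprint` are hypothetical helper names.

```c
#include <stdint.h>

/* Two buffers of size 2n and n, each limited to max_alloc bytes.
 * The 2n buffer is the binding constraint, so n <= max_alloc / 2. */
uint64_t largest_n(uint64_t max_alloc)
{
    return max_alloc / 2;
}

/* Combined size of both buffers (2n + n) at the largest allowed n. */
uint64_t total_footprint(uint64_t max_alloc)
{
    return 3 * largest_n(max_alloc);
}
```

With 1 GiB of device memory and a limit of one quarter of that (256 MiB), n is at most 128 MiB and the two buffers together take 384 MiB, which is 3/8 of the card.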

ajk · June 6, 2011, 1:05pm

According to the OpenCL specification, the minimum value for CL_DEVICE_MAX_MEM_ALLOC_SIZE is max(1/4th of CL_DEVICE_GLOBAL_MEM_SIZE, 128*1024*1024). So it can be more than 1/4th of the total memory size, but cannot be less. Perhaps NVIDIA just misread the specification…
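
As a small illustration of that lower bound, the formula quoted above can be written out directly. `spec_min_max_alloc` is a hypothetical helper name, and the 896 MiB figure used below is an assumed global memory size chosen for the example, not a value reported in this thread.

```c
#include <stdint.h>

/* Lower bound on CL_DEVICE_MAX_MEM_ALLOC_SIZE as quoted above:
 * max(global_mem_size / 4, 128*1024*1024). A device may report more
 * than this, but not less. */
uint64_t spec_min_max_alloc(uint64_t global_mem_size)
{
    const uint64_t floor_bytes = 128ULL * 1024 * 1024;
    uint64_t quarter = global_mem_size / 4;
    return quarter > floor_bytes ? quarter : floor_bytes;
}
```

For an assumed 896 MiB of global memory this gives 224 MiB, and for a 256 MiB device the 128 MiB floor applies instead. A driver that always reports exactly this value is within the specification, just at its minimum.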
