Unified memory and overprovisioning

iordan.bogdan101 · February 20, 2022, 12:34pm

Hello, I am trying to understand something about unified memory. I was under the impression that allocating and working on data sets larger than the device memory is possible, but that doesn’t seem to be the case.

For example, if I try N = 8192 * 5 in the code below, which works out to an array of ~6.7 GB and is within the VRAM capacity of my GPU (GTX 1080), everything works fine. But if I try, for instance, N = 8192 * 6, which is ~9.7 GB and so over the 8 GB of VRAM my card has, the allocation function starts returning out-of-memory errors.

	size_t N = 8192*5;
	float *A;

	cudaMallocManaged(&A, size_t(N * N * sizeof(float)));

	for (size_t i = 0; i < N * N; i++)
	{
		A[i] = 1.0f;
	}

	cudaFree(A);

Am I misunderstanding something? I should mention that I use CUDA 9.2 because I want to work with cuBLAS in device code, and apparently support for that was dropped in version 10.0.
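
For reference, the sizes quoted in this post can be checked with a small host-side sketch. The helper names `matrixBytes` and `exceedsDeviceMemory` are hypothetical, and the 8 GiB device capacity used in the usage note is an assumption based on the 8 GB figure given for the GTX 1080 above; on a real system the capacity could be read with `cudaMemGetInfo` instead.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed for an N x N matrix of float, computed in 64-bit arithmetic
// so the product cannot wrap on platforms with a narrower int.
constexpr std::uint64_t matrixBytes(std::uint64_t n) {
    return n * n * sizeof(float);
}

// True when the request is larger than the device memory, i.e. the
// allocation can only succeed if oversubscription is available.
// deviceBytes is supplied by the caller (assumed, not queried here).
constexpr bool exceedsDeviceMemory(std::uint64_t requestBytes,
                                   std::uint64_t deviceBytes) {
    return requestBytes > deviceBytes;
}
```

With an assumed capacity of `8ULL << 30` bytes, `matrixBytes(8192 * 5)` is 6,710,886,400 bytes and fits, while `matrixBytes(8192 * 6)` is 9,663,676,416 bytes and does not.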

striker159 · February 20, 2022, 5:32pm

Did you verify that oversubscription is supported on your GPU?

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDART__MEMORY_1gd228014f19cc0975ebe3e0dd2af6dd1b

Device memory oversubscription is possible for GPUs that have a non-zero value for the device attribute cudaDevAttrConcurrentManagedAccess. Managed memory on such GPUs may be evicted from device memory to host memory at any time by the Unified Memory driver in order to make room for other allocations.

rs277 · February 20, 2022, 6:31pm

It would seem that the GTX 1080 does support oversubscription, based on reply 6 in this thread:

rs277 · February 20, 2022, 7:31pm

Are you using Linux or Windows? Oversubscription is not available on Windows:

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-requirements

iordan.bogdan101 · February 20, 2022, 7:41pm

Yes, I’m on Windows, so that must be it. Thanks.

