Hello,
I have a question about the minimum granularity when allocating GPU memory with cuMemCreate.
On my system, an RTX 3090 with CUDA 12,
cuMemGetAllocationGranularity reports a minimum allocation granularity of 2MB.
I wonder where this 2MB came from.
I suspect this might be the GPU huge-page size, but why isn’t it the normal GPU page size, such as 4KB?
Is there any way to change this minimum allocation granularity?
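For reference, this is roughly how I query it (a minimal sketch for device 0; error checking omitted):

```cpp
#include <cuda.h>
#include <cstdio>

int main() {
    cuInit(0);
    CUdevice dev;
    cuDeviceGet(&dev, 0);
    CUcontext ctx;
    cuCtxCreate(&ctx, 0, dev);

    // Describe a pinned allocation located on device 0.
    CUmemAllocationProp prop = {};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = dev;

    size_t granularity = 0;
    cuMemGetAllocationGranularity(&granularity, &prop,
                                  CU_MEM_ALLOC_GRANULARITY_MINIMUM);
    printf("minimum granularity: %zu bytes\n", granularity);  // prints 2097152 (2MB) here

    cuCtxDestroy(ctx);
    return 0;
}
```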
Thanks.
Hey, I’m also curious about the same thing. Have you found out anything about it?
Hi @t-rprabhu
First of all, I couldn’t find any official comments on this question. However, I did come across some articles explaining this topic.
Inside the implementation of PyTorch, one of the most widely used machine learning frameworks, there’s a component that uses the CUDA Virtual Memory Management (VMM) APIs.
This part of the code explains the CUDA VMM as follows:
“When we allocate a new segment, we allocate enough address space to map essentially the entire physical memory of the GPU (which is 256TiB of address space). However, we only map as much physical memory as is needed by the program at the moment. As more memory is requested, we add more physical memory to the segment. This can work at the granularity of GPU pages, which are currently 2MiB.”
I believe 2MiB is the current page size NVIDIA GPUs use for these APIs, and mapping memory at a granularity smaller than 2MiB is not possible at this time.
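As a concrete illustration (my own minimal sketch, not PyTorch’s actual code; error checking omitted), this is the typical VMM pattern: reserve a large virtual address range, then create and map physical chunks in multiples of the reported granularity:

```cpp
#include <cuda.h>

int main() {
    cuInit(0);
    CUdevice dev;
    cuDeviceGet(&dev, 0);
    CUcontext ctx;
    cuCtxCreate(&ctx, 0, dev);

    CUmemAllocationProp prop = {};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = dev;

    size_t granularity = 0;
    cuMemGetAllocationGranularity(&granularity, &prop,
                                  CU_MEM_ALLOC_GRANULARITY_MINIMUM);

    // Reserve a large virtual range up front (room for many chunks).
    CUdeviceptr base;
    size_t reserve = 64 * granularity;
    cuMemAddressReserve(&base, reserve, 0, 0, 0);

    // Create one physical chunk; the size must be a multiple of the granularity.
    CUmemGenericAllocationHandle handle;
    cuMemCreate(&handle, granularity, &prop, 0);

    // Map it at the start of the reservation and enable access from the device.
    cuMemMap(base, granularity, 0, handle, 0);
    CUmemAccessDesc access = {};
    access.location = prop.location;
    access.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
    cuMemSetAccess(base, granularity, &access, 1);

    // ... use (void*)base; map more chunks as the program needs them ...

    cuMemUnmap(base, granularity);
    cuMemRelease(handle);
    cuMemAddressFree(base, reserve);
    cuCtxDestroy(ctx);
    return 0;
}
```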
I hope this information is helpful.
Source (PyTorch): pytorch/c10/cuda/CUDACachingAllocator.cpp at main · pytorch/pytorch · GitHub
Thanks.
@woosungkang Thanks for the speedy and helpful reply!
Hi, I have a question about allocation. For example, if we have two buffers, a and b, both much smaller than the page size (2MiB), will they be placed in the same page, or will they occupy two separate 2MiB pages?
Thanks.
A page, once mapped for one allocation, cannot also be mapped for another allocation.
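In other words, each cuMemCreate allocation is rounded up to the 2MiB granularity and gets its own physical backing, so two separately allocated small buffers consume two 2MiB chunks; packing them into one page requires suballocating within a single allocation yourself. A small illustration of the rounding (the helper name is my own):

```cpp
#include <cstddef>
#include <cstdio>

// Illustrative helper: cuMemCreate sizes must be multiples of the
// reported granularity, so each allocation is rounded up on its own.
size_t round_up(size_t size, size_t granularity) {
    return ((size + granularity - 1) / granularity) * granularity;
}

int main() {
    const size_t granularity = 2 * 1024 * 1024;  // 2MiB, as reported on this system
    size_t a = round_up(4096, granularity);      // 2MiB for buffer a
    size_t b = round_up(4096, granularity);      // another 2MiB for buffer b
    printf("a + b = %zu bytes\n", a + b);        // 4MiB total, not 2MiB
    return 0;
}
```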