About cudaMallocManaged() granularity

Hello.

I have some questions about cudaMallocManaged() granularity. When I allocate unified memory with cudaMallocManaged() and the size is under 128 MiB, the reported free memory decreases by 128 MiB (e.g., I allocated 16 MiB of unified memory, but the free memory is not 40 GiB - 16 MiB; it drops by a full 128 MiB). Why is this happening?

Below is my code flow.

GPU: NVIDIA A100 40GB
CUDA driver version: 525.85.12
CUDA version: 12.0

cudaMemGetInfo() // 1
cudaMallocManaged(&arr, 16 MiB)
init(arr)
cudaMemPrefetchAsync(arr, 16 MiB, device)
cudaMemGetInfo() // 2

The drop in free memory between (1) and (2) is not 16 MiB; it is 128 MiB.
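For completeness, here is the flow above as a minimal compilable sketch (the float element type, the host-side init loop, and prefetching to device 0 are placeholders for my actual code; error checking omitted for brevity):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 16ull * 1024 * 1024;   // 16 MiB

    size_t free1, free2, total;
    cudaMemGetInfo(&free1, &total);             // (1)

    float *arr = nullptr;
    cudaMallocManaged(&arr, bytes);

    // init(arr): touch the allocation on the host
    for (size_t i = 0; i < bytes / sizeof(float); ++i) arr[i] = 0.0f;

    cudaMemPrefetchAsync(arr, bytes, /*dstDevice=*/0);  // device 0 assumed
    cudaDeviceSynchronize();

    cudaMemGetInfo(&free2, &total);             // (2)
    printf("free(1) - free(2) = %zu MiB\n", (free1 - free2) >> 20);

    cudaFree(arr);
    return 0;
}
```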

I also have this problem with UVM in general. I'm waiting for an answer.

There is apparently some allocation granularity in CUDA. The details of it are not published. Here are some examples of related questions/discussion.
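One way to observe the granularity empirically is to allocate managed buffers of increasing size, prefetch them to the device, and watch how much the reported free memory drops each time. A rough sketch (device 0 and the size range are assumptions; cudaMemGetInfo deltas can also include unrelated overhead, so treat the numbers as indicative only):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaFree(0);  // force context creation so it doesn't skew the first measurement

    for (size_t mib = 1; mib <= 512; mib *= 2) {
        size_t before, after, total;
        cudaMemGetInfo(&before, &total);

        void *p = nullptr;
        cudaMallocManaged(&p, mib << 20);
        cudaMemPrefetchAsync(p, mib << 20, /*dstDevice=*/0);  // populate pages on device 0 (assumed)
        cudaDeviceSynchronize();

        cudaMemGetInfo(&after, &total);
        printf("requested %4zu MiB -> free memory dropped by %4zu MiB\n",
               mib, (before - after) >> 20);
        cudaFree(p);
    }
    return 0;
}
```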