Something weird about managed memory and free GPU memory size

Hi all,
I want to fill GPU memory completely. I allocate a large buffer with cudaMallocManaged and initialize it in a kernel, but a lot of GPU memory still shows as free. Any help?
At the beginning, the GPU reports about 7.5 GB free. After filling 7.5 GB of data, about 5.5 GB is still reported free.
Here is my code:

#include <cuda_runtime.h>
#include <iostream>
using namespace std;

// Touch every byte of the allocation with a grid-stride loop.
__global__ void incr(char* a, size_t n)
{
    // Use size_t for the index: the allocation is ~7.5 GB, which would
    // overflow a 32-bit int and invoke undefined behavior in the loop.
    size_t tid = (size_t)blockDim.x * blockIdx.x + threadIdx.x;
    size_t stride = (size_t)blockDim.x * gridDim.x;
    for(size_t i = tid; i < n; i += stride)
    {
        a[i] = 0;
    }
}

int main()
{
    size_t freeSize, total;
    cudaMemGetInfo(&freeSize, &total);
    cout << "free:" << freeSize << endl;

    char* data;
    cudaMallocManaged((void**)&data, freeSize);
    incr<<<512, 512>>>(data, freeSize);
    cudaDeviceSynchronize();

    cudaMemGetInfo(&freeSize, &total);
    cout << "free:" << freeSize << endl;
}

For a definitive answer, you should probably provide more information:

  1. Operating System
  2. CUDA version
  3. GPU you are running on

All of these things affect managed memory behavior.

However, my guess is you are in a demand-paged, oversubscription environment.
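
If it helps, here is a minimal sketch (my own, not from the original post) that queries whether the current device supports concurrent managed access, which is the device attribute that typically indicates demand-paged managed memory:

#include <cuda_runtime.h>
#include <iostream>

int main()
{
    int dev = 0;
    cudaGetDevice(&dev);

    // Nonzero when the device supports demand-paged (concurrent) managed
    // memory access, i.e., the oversubscription regime described above.
    int concurrentManaged = 0;
    cudaDeviceGetAttribute(&concurrentManaged,
                           cudaDevAttrConcurrentManagedAccess, dev);

    std::cout << "concurrent managed access: " << concurrentManaged << std::endl;
}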

In that case, allocating memory via cudaMallocManaged does not really “use up” GPU memory. Due to demand paging, the GPU can always work on additional data. A managed allocation migrates to whichever processor touches it, so it does not permanently “use up” the memory of a particular processor, and it has no logical association with a particular GPU until that GPU attempts to use it.
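
To illustrate the migration behavior, here is a sketch (my own, assuming a demand-paged system) that uses cudaMemPrefetchAsync to explicitly migrate a managed allocation to the GPU. After the prefetch, cudaMemGetInfo will typically report a correspondingly lower free size, although the driver remains free to evict those pages later:

#include <cuda_runtime.h>
#include <iostream>

int main()
{
    size_t freeBefore, freeAfter, total;
    cudaMemGetInfo(&freeBefore, &total);

    // Allocate 1 GB of managed memory; nothing is resident anywhere yet.
    size_t n = 1ULL << 30;
    char* data = nullptr;
    cudaMallocManaged(&data, n);

    // Explicitly migrate the pages to device 0. This is what creates the
    // physical association with the GPU, not the allocation itself.
    cudaMemPrefetchAsync(data, n, 0 /* device */, 0 /* stream */);
    cudaDeviceSynchronize();

    cudaMemGetInfo(&freeAfter, &total);
    std::cout << "free before: " << freeBefore
              << " free after prefetch: " << freeAfter << std::endl;

    cudaFree(data);
}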

If you want to see the reported memory on the GPU get “used up”, you will need to switch to cudaMalloc. (And you will need to reduce your size somewhat, since a cudaMalloc request for the entire reported free amount will usually fail.)
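
For example, here is a minimal sketch of that approach. The 256 MB headroom is a hypothetical margin of my choosing; the exact amount needed varies by system:

#include <cuda_runtime.h>
#include <iostream>

int main()
{
    size_t freeSize, total;
    cudaMemGetInfo(&freeSize, &total);
    std::cout << "free:" << freeSize << std::endl;

    // Leave some headroom: asking cudaMalloc for every reported free byte
    // usually fails because of allocation granularity and driver overhead.
    size_t n = freeSize - (256ULL << 20);

    char* data = nullptr;
    cudaError_t err = cudaMalloc(&data, n);
    if (err != cudaSuccess)
    {
        std::cout << "cudaMalloc failed: " << cudaGetErrorString(err) << std::endl;
        return 1;
    }

    // Unlike cudaMallocManaged, this allocation is physically backed on the
    // GPU immediately, so the reported free size drops right away.
    cudaMemGetInfo(&freeSize, &total);
    std::cout << "free:" << freeSize << std::endl;

    cudaFree(data);
}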

You may wish to read the documentation on managed memory:

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-unified-memory-programming-hd

https://stackoverflow.com/questions/59035322/a-question-about-cuda-free-memory-and-managed-memory