I haven’t found any documentation on how to interpret what nvidia-smi reports regarding Memory Usage for processes that use Unified Memory (cudaMallocManaged()). I understand that the interpretation for the Normal Allocation case (cudaMalloc()) is well-defined and documented.
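For reference, the kind of program I am talking about is roughly the following (a minimal sketch, not the exact code behind the screenshots below): it allocates Unified Memory and touches it from the GPU so that the pages migrate to device memory, then waits so nvidia-smi can be inspected.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

__global__ void touch(char *p, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) p[i] = 1;                      // first touch from the GPU faults the page onto the device
}

int main() {
    const size_t bytes = 4ull << 30;          // a multi-GiB managed allocation
    char *p = nullptr;
    cudaMallocManaged(&p, bytes);             // Unified Memory allocation
    const unsigned int threads = 256;
    const unsigned int blocks = (unsigned int)((bytes + threads - 1) / threads);
    touch<<<blocks, threads>>>(p, bytes);
    cudaDeviceSynchronize();
    std::getchar();                           // keep the process alive; inspect nvidia-smi now
    cudaFree(p);
    return 0;
}
```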
Below, I will lay out my understanding of the Unified Memory case.
Take this screenshot as an example:
Let the top-middle value, which is in the format {Used} / {Total} MiB, be called Overall Memory Usage. Let the bottom-right value, which is in the format {Used} MiB, be called GPU Memory Usage.
My understanding is that Overall Memory Usage (the top value) represents the total GPU memory occupied by data from all processes (CUDA contexts), and there is no way to know how much of it is owned by which process (context). GPU Memory Usage (the bottom, per-process value) represents the size of that process’s CUDA context (in the range of a few hundred MiB); the Unified Memory allocations made by the process are not accounted for.
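If this is correct, then the only number that reflects Unified Memory residency is the device-wide one, which a process can also query itself via cudaMemGetInfo(). A small sketch of what I mean; my assumption is that total - free tracks the same counter as the top nvidia-smi value:

```cpp
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    size_t free_bytes = 0, total_bytes = 0;
    cudaMemGetInfo(&free_bytes, &total_bytes);           // device-wide counters
    std::printf("used: %zu MiB / total: %zu MiB\n",
                (total_bytes - free_bytes) >> 20, total_bytes >> 20);
    return 0;
}
```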
So, my interpretation of the screenshot is as follows:
From the output of nvidia-smi, we can tell that the process with PID 13987 has ~5003 MiB (5900 - 897) of data resident on the GPU, either prefetched manually or brought in via page faults (disregarding the internally reserved memory, which is on the order of tens of MiB). We can’t tell what the total amount of Unified Memory allocated by the process is.
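To be explicit about what I mean by “prefetched manually”: in my tests, residency is driven either by GPU page faults (as in the sketch above) or by an explicit prefetch along these lines (a sketch, assuming a single GPU at device index 0):

```cpp
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 1ull << 30;                     // 1 GiB managed allocation
    char *p = nullptr;
    cudaMallocManaged(&p, bytes);
    cudaMemPrefetchAsync(p, bytes, 0 /* dstDevice */, 0 /* default stream */);
    cudaDeviceSynchronize();                             // pages should now be resident in GPU memory
    cudaFree(p);
    return 0;
}
```

As I understand it, after this the allocation counts toward the device-wide Used value, but not toward the per-process GPU Memory Usage.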
I am also interested in the case where more than one application is executing, as in the screenshot below:
In this case, my interpretation is as follows:
We can tell that the two processes, with PIDs 14229 and 21047, have a total of ~9867 MiB (11661 - 2*897) of data resident on the GPU, either prefetched manually or brought in via page faults. We can’t tell how much data belongs to each process, and we also cannot tell what the total amount of Unified Memory allocated by either process is.
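For completeness, the same numbers can also be read programmatically through NVML (linking against nvidia-ml); as far as I can tell, the per-process usedGpuMemory matches the bottom table of nvidia-smi and likewise does not include Unified Memory. A sketch:

```cpp
#include <nvml.h>
#include <cstdio>

int main() {
    nvmlInit();
    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);

    nvmlMemory_t mem;
    nvmlDeviceGetMemoryInfo(dev, &mem);                  // device-wide counter (top value)
    std::printf("device used: %llu MiB\n", mem.used >> 20);

    unsigned int count = 32;
    nvmlProcessInfo_t procs[32];
    nvmlDeviceGetComputeRunningProcesses(dev, &count, procs);   // per-process (bottom table)
    for (unsigned int i = 0; i < count; ++i)
        std::printf("pid %u: %llu MiB\n", procs[i].pid,
                    procs[i].usedGpuMemory >> 20);

    nvmlShutdown();
    return 0;
}
```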
If possible, I would like an authoritative answer on this matter from NVIDIA.