I created a small replacement for the built-in DGX Dashboard because I wanted to be able to access it without having to tunnel a port, I wanted to see RAM usage without cache/buffers, and I wanted some additional stats (like temps and CPU usage).
However, when testing it after a reboot, I noticed a big difference in the memory reported by each, well beyond what caches alone would explain. My dashboard showed around 3GB while the built-in one showed 13GB!
I’m getting my numbers from free and it seems like these don’t match what’s shown in the built-in dashboard. Here’s a screenshot:
The dashboard shows 13GB used, but the output of free shows 3.8Gi used, 2.0Gi buffers/cache.
While some of the difference comes from GB vs. Gi, these numbers still don’t seem to add up. My guess is that the dashboard could be doing 128GB minus 115Gi (“available”) to get 13, but if these are not the same units then I don’t think this is sound. Shouldn’t it do 119-115 to get 4Gi used (and convert that to GB if required)?
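To make the arithmetic concrete, here is a rough Python sketch of the two calculations. It only uses the rounded figures quoted above; the variable names are mine, and the "mixed" calculation is just my guess at what the dashboard does, not something I have confirmed.

```python
GIB = 2**30  # bytes in one gibibyte (what free labels "Gi")
GB = 10**9   # bytes in one gigabyte


def gib_to_gb(gib: float) -> float:
    """Convert a size in GiB to GB."""
    return gib * GIB / GB


# Rounded values taken from the free output quoted above.
total_gib = 119
available_gib = 115

# Suspected (unconfirmed) dashboard arithmetic: a total in GB minus
# "available" in Gi, which mixes two different units.
mixed_used = 128 - available_gib

# Consistent arithmetic: subtract in one unit, convert afterwards.
used_gib = total_gib - available_gib
used_gb = gib_to_gb(used_gib)

print(f"mixed units: {mixed_used}")
print(f"consistent:  {used_gib} Gi = {used_gb:.1f} GB")
```

The mixed version lands on 13, which matches what the dashboard shows, while the consistent version gives 4Gi (about 4.3GB).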
But 56076940/125513944 is 44.7%. The numbers in /proc/meminfo are in KiB (despite the confusing “kB” label), as can be inferred from MemTotal being 125x and not 128x. My suspicion was that perhaps MemFree is being treated as if it were kB, resulting in the wrong amount of free memory, and that the total is hard-coded as 128 (rather than also being read from MemTotal with the same kB assumption, which would give the correct % but show the total as 125GB).
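Here is a minimal sketch of how I would read these values, assuming the “kB” label really means units of 1024 bytes. The parse_meminfo helper and the one-line sample are my own illustration (only the MemTotal value and the two figures in the ratio come from this thread); on a real system you would pass in the contents of /proc/meminfo instead.

```python
KIB = 1024  # /proc/meminfo's "kB" label means 1024 bytes


def parse_meminfo(text: str) -> dict[str, int]:
    """Return {field: bytes} for lines like 'MemTotal: 125513944 kB'."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        amount = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            amount *= KIB  # convert to bytes
        values[name.strip()] = amount
    return values


# Illustrative input using the MemTotal value quoted above.
sample = "MemTotal:       125513944 kB\n"
total_bytes = parse_meminfo(sample)["MemTotal"]

total_gb = total_bytes / 10**9   # decimal gigabytes
total_gib = total_bytes / 2**30  # binary gibibytes

# The ratio quoted above; it is unit-independent.
ratio_percent = 56076940 / 125513944 * 100

print(f"MemTotal: {total_gb:.1f} GB = {total_gib:.1f} GiB")
print(f"ratio: {ratio_percent:.1f}%")
```

Read this way, MemTotal comes out at roughly 128.5GB or 119.7GiB, which is why a total of 125 only shows up if the values are misread as multiples of 1000 bytes.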
Can you elaborate on this? In what way would they not be accurate? I’m a bit of a noob and curious (on the surface, the numbers in this file seem reasonable for what I’m expecting).
This looks better, but still doesn’t seem correct. Here’s my output of free -h --si and free -h and you can see that the actual usage is 3.6GB/128GB or 3.3Gi/119Gi, but the dashboard is showing 3.3/128. It seems that it’s still mixing up the units (showing the used memory in Gibibytes but the total in Gigabytes).
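For what it’s worth, this is roughly how I would expect the display to be formatted so that both numbers share a unit. It is only a sketch: format_usage is a hypothetical helper of mine, and the byte counts are illustrative values chosen to be close to my free output, not anything read from the dashboard.

```python
def format_usage(used_bytes: int, total_bytes: int, si: bool = True) -> str:
    """Format used/total memory with BOTH values in the same unit."""
    divisor, unit = (10**9, "GB") if si else (2**30, "GiB")
    used = used_bytes / divisor
    total = total_bytes / divisor
    return f"{used:.1f}/{total:.1f} {unit}"


# Illustrative values: about 3.3 GiB used on a machine whose
# MemTotal is 125513944 KiB.
used_bytes = int(3.3 * 2**30)
total_bytes = 125513944 * 1024

print(format_usage(used_bytes, total_bytes, si=True))
print(format_usage(used_bytes, total_bytes, si=False))
```

Either output would be fine; the problem is only when the used value comes from the binary calculation and the total from the decimal one.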