DGX Spark can only use 119GB of memory when running ComfyUI, so the process gets killed. Isn't there 128GB of memory? Why can't all of it be used?

KILLED, WHY?

How much memory a process can use depends on the workload and on other environment settings that can limit it. What error messages are you seeing that indicate it is an out-of-memory error?

I believe the dashboard is mixing up gigabytes (GB) and gibibytes (GiB), and it's actually using 128GB when it shows 119GB (119GiB is roughly 128GB).
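For anyone who wants to check the arithmetic, here's the conversion (a quick sketch, nothing DGX-specific):

```python
# 119 GiB expressed in decimal gigabytes:
# 1 GiB = 2**30 bytes, 1 GB = 10**9 bytes
print(119 * 2**30 / 10**9)  # ~127.8, i.e. roughly 128 GB
```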

I’ve mentioned this before at What is the RAM value shown on the DGX Dashboard? It doesn't seem to match `free` - #8 by DannyTup

A fix was made, but unfortunately it didn't solve the problem; it just changed the way in which it is wrong.

The easiest way to verify this is to run `free -h --si` and compare the numbers to the dashboard: they won't match. If you instead run `free -h`, they will match, but those figures are in GiB.
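If you'd rather skip `free` entirely, here's a minimal sketch that reads MemTotal straight from /proc/meminfo and prints it in both units, so you can see which figure the dashboard should be showing (assumes a standard Linux /proc/meminfo layout):

```python
# Sketch: print MemTotal in both GiB (binary, what `free -h` shows)
# and GB (decimal, what `free -h --si` shows).
with open("/proc/meminfo") as f:
    for line in f:
        if line.startswith("MemTotal:"):
            # /proc/meminfo labels the value "kB", but it is really KiB
            kib = int(line.split()[1])
            break

total_bytes = kib * 1024
print(f"{total_bytes / 2**30:.1f} GiB = {total_bytes / 10**9:.1f} GB")
```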

I made my own dashboard for some other reasons (no need to tunnel, and an easy way to stop/start/monitor resource usage of Docker containers), and I believe mine shows memory usage correctly: it is always above the built-in dashboard's figure by exactly the amount you'd expect if their number were actually GiB.


I had the same issue when I loaded two models in parallel and played around with the context windows. If your assumption is correct that GiB was confused with GB here, and you (?) already pointed this out some time ago, then NVIDIA, are you okay?