I cannot log in to my DGX Spark, which is connected directly via KVM, and I cannot SSH into it either. After I enter my credentials, it just hangs on the login screen. This started two nights ago while I was training some models. Please help!
It looks like the system ran out of memory (OOM), which would explain why both the local login and SSH hang. First, press the device's power button to shut it down, then power it back on. As soon as you are logged in, run `docker ps` to see which containers are running, then stop them before they use up the memory again.
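In case it helps, here is a rough sketch of the "check, then stop" step as a small shell function. The function name `stop_all_containers` is just something I made up for illustration, and it assumes every running container is safe to stop; if you only want to stop the training job, run `docker ps` and then `docker stop` with that one container ID instead.

```shell
# Hypothetical helper (not part of Docker): stop every running container.
# Assumes the docker CLI is on PATH and that all running containers may be stopped.
stop_all_containers() {
    local id
    # `docker ps -q` prints only the IDs of running containers, one per line.
    for id in $(docker ps -q); do
        echo "stopping $id"
        # `docker stop` sends SIGTERM, then SIGKILL after a grace period.
        docker stop "$id" >/dev/null
    done
}

# Usage, right after logging in:
#   docker ps              # see what is running
#   stop_all_containers    # stop everything that is listed
```

Stopping the containers one by one prints each ID as it goes, so if the machine starts hanging again you can at least see how far it got.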
You were right! I managed to SSH into the Spark long enough to stop the running containers. Is there any way to prevent this from happening in the future?