Not sure if this is the correct sub-forum, but this would be a handy tile to have in the dashboard, especially when running more ML-focused workflows.
Any ideas if we can add it?
I will suggest this to our engineering team. In the meantime, you can use programs like top to measure CPU utilization.
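For anyone who wants the number without an interactive tool, here is a minimal Python sketch of how CPU utilization can be derived from two readings of /proc/stat on Linux. The function names (parse_cpu_line, cpu_utilization, sample_cpu_utilization) are made up for illustration, and counting iowait as idle time is an assumption; tools like top may account for it differently.

```python
import time
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    # Assumption: iowait is treated as idle time.
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    # guest and guest_nice are already included in user and nice, so stop at steal.
    total = sum(values[:8])
    return idle, total


def cpu_utilization(before: str, after: str) -> float:
    """Percentage of non-idle time between two 'cpu' lines."""
    idle_a, total_a = parse_cpu_line(before)
    idle_b, total_b = parse_cpu_line(after)
    total_delta = total_b - total_a
    if total_delta <= 0:
        return 0.0
    return 100.0 * (1.0 - (idle_b - idle_a) / total_delta)


def sample_cpu_utilization(interval: float = 1.0, path: str = "/proc/stat") -> float:
    """Read /proc/stat twice, `interval` seconds apart, and return the utilization."""
    with open(path) as f:
        first = f.readline()
    time.sleep(interval)
    with open(path) as f:
        second = f.readline()
    return cpu_utilization(first, second)
```

Calling sample_cpu_utilization() on the Spark should return a percentage averaged across all cores over the chosen interval.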
@cart-bandit Until official support is developed by the engineering team, you can use this dashboard developed by the DGX Spark community: DanTup/dgx_dashboard (https://github.com/DanTup/dgx_dashboard), a simple dashboard for the DGX Spark.
dgx_dashboard runs in a container. There are many terminal apps to display system info in the shell.
Here’s how btop looks on Spark:
btop offers a lot of customization. The above is the default.
Yep, I run everything in a container to keep things isolated and easy to set back up if I have to reinstall anything. It also allows you to control what it can access more easily (for example you can choose whether to expose the docker socket to it for the ability to monitor and start/stop containers).
There shouldn’t be any reason you couldn’t run it outside the container, but since I didn’t build it for that you might need to manually run the commands to compile it etc. (which will require a Dart SDK).
If you’re already in a terminal it’s perhaps not the best option. I built it because I just wanted something similar to the original dashboard that I could access in my browser across the network without needing to open an SSH tunnel, and with some additional stats (CPU, temps, accurate memory stats - the built-in dashboard is still wrong), and the ability to start/stop/monitor docker containers (since I run everything inside containers) 🙂
Using the dart:stable-sdk container and the dart compile exe bin/main.dart -o bin/dgx_dashboard command from the Dockerfile, the app built successfully.
After copying the binary and web assets to the host, the dashboard launches and can be accessed remotely:
Thank you, @DannyTup !
This could be packaged in a .deb file and made available to the community. Unpacked, the dgx_dashboard folder is just 6.9M.
This isn’t something I’ve done before, but if you want to file an issue at https://github.com/DanTup/dgx_dashboard I can try to take a look some time (or a PR is welcome!).
I’m not sure if GitHub Packages supports .deb like it does for containers, so it might need to be published differently to how the Docker image is.
FWIW I have been meaning to look at the container to see if I can use a smaller base image. I did change it so the Dart image is only used for the compilation (and is not the base for the container at runtime), but I haven’t got as far as looking for the smallest base it can run with yet (and again, PRs are welcome if others have ideas about that before I get to it).
I haven’t had a chance to look at .deb, but FWIW I made some changes today to significantly reduce the size of the container. It’s now around 45MB (instead of 500MB). It uses around 3-4MiB RAM at idle when no clients are connected, and around 26MiB RAM when a client is connected (nvidia-smi is run only while clients are connected).
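For readers curious about the "only while clients are connected" idea, here is a rough Python sketch of the general pattern. It is not the dashboard's actual code (that is written in Dart); the class name OnDemandPoller and its methods are hypothetical, and the sample callable stands in for whatever collects the stats.

```python
import threading
from typing import Callable, Optional


class OnDemandPoller:
    """Calls `sample` periodically, but only while at least one client is connected."""

    def __init__(self, sample: Callable[[], None], interval: float = 1.0) -> None:
        self._sample = sample
        self._interval = interval
        self._clients = 0
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    @property
    def running(self) -> bool:
        return self._thread is not None

    def client_connected(self) -> None:
        with self._lock:
            self._clients += 1
            if self._clients == 1:
                # First client: start the polling thread.
                self._stop = threading.Event()
                self._thread = threading.Thread(
                    target=self._run, args=(self._stop,), daemon=True
                )
                self._thread.start()

    def client_disconnected(self) -> None:
        with self._lock:
            if self._clients == 0:
                return
            self._clients -= 1
            if self._clients > 0:
                return
            # Last client left: signal the thread to stop.
            thread, self._thread = self._thread, None
            self._stop.set()
        if thread is not None:
            thread.join()

    def _run(self, stop: threading.Event) -> None:
        while not stop.is_set():
            self._sample()
            # Sleep for the interval, waking early if asked to stop.
            stop.wait(self._interval)
```

A server would call client_connected() when a browser connects and client_disconnected() when it goes away, so the sampler (for example, something that shells out to nvidia-smi) costs nothing while nobody is watching.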
If you’re running a previous version, you need to stop and remove the container and re-run to pull the latest version:
github.com/DanTup/dgx_dashboard?tab=readme-ov-file#updating-the-container