sparkrun. I also recommend it for single nodes (not just multi-node clusters).
The latest version also includes a CLI setup wizard that walks you through networking, SSH config, etc. The wizard is new, so feedback is appreciated, but hopefully it helps with the configuration as well.
Install uv if you don’t have it already: curl -LsSf https://astral.sh/uv/install.sh | sh
Install sparkrun and start the wizard: uvx sparkrun setup
You’ll probably want to accept most of the defaults / say yes a lot – but you’ll have to give it the IPs of your first node and the other node when it asks, e.g. 127.0.0.1,192.168.44.21, where 127.0.0.1 means the current system and 192.168.44.21 is the Ethernet IP of your new 2nd Spark. It may ask you to type in passwords as part of the setup process; it doesn’t save them. (Example IPs assume you’re operating from Spark #1.)
Then you can use existing “recipes” for sglang models from the preconfigured registries or make your own.
drew@spark-2918:~$ sparkrun list sglang
Name Runtime TP Nodes GPU Mem Model Registry
--------------------------------------------------------------------------------------------------------------------
qwen3-1.7b-sglang sglang 1 1 0.3 Qwen/Qwen3-1.7B sparkrun-transitional
qwen3-coder-next-fp8-sglang sglang 2 2 0.8 Qwen/Qwen3-Coder-Next-FP8 sparkrun-transitional
qwen3.5-0.8b-bf16-sglang sglang 1 1 0.8 Qwen/Qwen3.5-0.8B sparkrun-transitional
qwen3.5-122b-a10b-fp8-sglang sglang 2 2 0.8 Qwen/Qwen3.5-122B-A10B-FP8 sparkrun-transitional
qwen3.5-27b-fp8-sglang sglang 1 1 0.8 Qwen/Qwen3.5-27B-FP8 sparkrun-transitional
qwen3.5-2b-bf16-sglang sglang 1 1 0.8 Qwen/Qwen3.5-2B sparkrun-transitional
qwen3.5-35b-a3b-bf16-sglang sglang 1 1 0.8 Qwen/Qwen3.5-35B-A3B sparkrun-transitional
qwen3.5-35b-a3b-fp8-sglang sglang 1 1 0.8 Qwen/Qwen3.5-35B-A3B-FP8 sparkrun-transitional
qwen3.5-4b-bf16-sglang sglang 1 1 0.8 Qwen/Qwen3.5-4B sparkrun-transitional
qwen3.5-9b-bf16-sglang sglang 1 1 0.8 Qwen/Qwen3.5-9B sparkrun-transitional
The registries are all publicly available git repos. Recently, I’ve been working with @eugr and @raphael.amorim on Spark Arena, and since @eugr is the king of DGX Spark vLLM, I’ve been rather vLLM-focused lately (the past 1-2 weeks), but I do plan to come back to sglang containers and recipes.
You can run an existing recipe easily enough:
Run it with default settings: sparkrun run qwen3.5-35b-a3b-fp8-sglang
Override to use tensor parallelism across nodes and reduce GPU memory utilization: sparkrun run qwen3.5-35b-a3b-fp8-sglang --tp 2 --gpu-mem 0.5. That should give you a nice speed boost by leveraging both nodes; I reduced the target memory utilization in this example to leave more RAM open for other things.
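Once a recipe is up, sglang exposes an OpenAI-compatible HTTP API, so you can talk to it with plain curl. A quick sketch (port 30000 is sglang's default; your recipe may bind a different host/port, so check the launch output):

```shell
# Build the request payload (model name taken from the recipe above):
PAYLOAD='{"model": "Qwen/Qwen3.5-35B-A3B-FP8", "messages": [{"role": "user", "content": "Hello from the Spark cluster!"}], "max_tokens": 64}'

# Sanity-check that the payload is valid JSON before sending:
echo "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload ok"

# With the server running, send it (assumption: sglang's default port 30000):
# curl -s http://127.0.0.1:30000/v1/chat/completions \
#   -H "Content-Type: application/json" -d "$PAYLOAD"
```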
You can view the recipe text with:
sparkrun export recipe qwen3.5-35b-a3b-fp8-sglang
–or–
save it to a file with: sparkrun export recipe qwen3.5-35b-a3b-fp8-sglang --save my-recipe.yaml
Then you can edit the defaults to your preferences, save it, and run
sparkrun run ./my-recipe.yaml; it won’t require you to override settings at the CLI.
When you make your own recipes, you can also change the model, base container, etc., so you can pretty much automate running whatever you want to run. You can then publish your recipes to registries to manage them via git or to share them with others.
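For a rough sense of what a recipe holds, here's a hand-written sketch based on the columns from sparkrun list – the field names are illustrative guesses, not the real schema, so export an existing recipe to see the actual format:

```yaml
# Illustrative only -- field names guessed from the `sparkrun list` columns.
# Run `sparkrun export recipe <name>` to see a real recipe's schema.
name: my-qwen3.5-35b
runtime: sglang
model: Qwen/Qwen3.5-35B-A3B-FP8
tp: 2
nodes: 2
gpu_mem: 0.5
```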
You could even install a model as a system service with sparkrun export systemd.
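I haven't pasted real sparkrun export systemd output here, but a generated unit would presumably look something along these lines (unit name, paths, and ExecStart are all assumptions on my part, not sparkrun's actual output):

```ini
[Unit]
Description=sparkrun inference: qwen3.5-35b-a3b-fp8-sglang
After=network-online.target
Wants=network-online.target

[Service]
# Illustrative only -- the real exported unit may differ.
ExecStart=/usr/local/bin/sparkrun run qwen3.5-35b-a3b-fp8-sglang
Restart=on-failure

[Install]
WantedBy=multi-user.target
```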
You can also run sparkrun on a separate Linux/Mac/(Windows via WSL) machine (not one of your Sparks) and use it to manage/orchestrate your Sparks.
There are fairly complete docs on the website (https://sparkrun.dev) so you can look stuff up there or chat on the forums about it: Sparkrun - central command with tab completion for launching inference on Spark Clusters - #48 by dbsci
Happy Sparking!
P.S. >> I forgot to mention there is also a Claude Code plugin. So once you’re set up, you can use the plugin to check/start/stop inference jobs via Claude Code. More AI automation will be coming soon.