Hello everyone!
I’ve made a nice runner for DGX Spark and RTX graphics cards that uses the vLLM backend.
The full description is on GitHub. Feel free to contribute.
tpnthr