Hello everyone!
I’ve made a runner for DGX Spark and RTX graphics cards that uses vLLM as its backend.
The full description is on GitHub. Feel free to contribute.
tpnthr