NVIDIA Developer Forums

Vllama - Ollama-like runner for DGX and other Blackwell GPUs


tpnthr May 16, 2026, 11:37am 1

Hello everyone!
I’ve made a nice runner for DGX Spark and RTX graphics cards that uses vLLM as its backend.
The full description is on GitHub. Feel free to contribute.
