What's the biggest LLM you've been able to run on a Cluster of DGX Sparks with a large context window (128k and up)?

As for the prices:

As for the “biggest LLM”: maybe first check out what you can expect when running different LLMs over here:

using the well-known eugr vLLM tools (they make running models across a cluster much easier).

And if you want some more speed (not always the case), have a look over here:

llama.cpp is handy for single-Spark use; for agentic workloads, vLLM should be the better fit.
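To make the single-Spark vs. agentic distinction concrete, here is a minimal sketch of the two serving setups. The model names and file paths are placeholders, not recommendations; the flags come from the stock llama.cpp and vLLM CLIs:

```shell
# Single Spark: llama.cpp's built-in OpenAI-compatible server.
# -c sets the context window in tokens (131072 = 128k);
# the GGUF path is a placeholder for whatever model you downloaded.
llama-server -m ./models/my-model-q4_k_m.gguf -c 131072 --port 8080

# Agentic / many-concurrent-request workloads: vLLM's server,
# which handles continuous batching of parallel requests much better.
# --max-model-len caps the context length per request.
vllm serve Qwen/Qwen3-32B --max-model-len 131072 --port 8000
```

Both expose an OpenAI-style `/v1/chat/completions` endpoint, so agent frameworks can point at either one; the difference shows up under concurrent load, where vLLM's scheduler keeps throughput up.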

And if you have too much money or a lot of YouTube subscribers:

AFAIR Alex ran Kimi K2 and Qwen3.5 397B - you just need 8 Sparks.
