| Topic | Replies | Views | Date |
|---|---|---|---|
| MiniMax M3 NVFP4 and NVFP4 REAP 50 for 4x & 2x DGX Sparks | 51 | 2878 | June 30, 2026 |
| [GUIDE] DeepSeek-V4-Flash on 2× DGX Spark (GB10) — Reproducible vLLM Serving Recipe up to 1M Token Context | 2 | 323 | June 30, 2026 |
| Asus GX10 Stable: Hermes Twin Qwen3.6-35A-A3B + Qwen3.6-27B + ComfyUI | 8 | 1183 | June 29, 2026 |
| Introducing the Atlas Inference Server and Engine | 162 | 9668 | June 29, 2026 |
| Open-source recipe + scaffold: training a DSpark-class speculative-decoding draft for Nemotron | 3 | 144 | June 29, 2026 |
| Ornith 1.0 Anyone? | 6 | 2053 | June 29, 2026 |
| 2 node spark vs 3 or 4 node spark | 18 | 1023 | June 29, 2026 |
| New model (one of the best i ever used, in my humble opinion): Qwen Agent World for local programming | 0 | 243 | June 29, 2026 |
| DFlash for Qwen3.5-122B-A10B = 80+ tok/s on 1x Spark! | 28 | 2773 | June 29, 2026 |
| Fully custom CUDA-native Deepseek 4 Flash optimized for 1x Spark! antirez/ds4 | 77 | 7635 | June 28, 2026 |
| Btop for DGX Spark | 3 | 1003 | June 28, 2026 |
| Eigr's gold by Leathery Tendons | 0 | 109 | June 28, 2026 |
| vLLM for Inference with 2 sparks example - WARNING 06-28 14:18:56 [ray_utils.py:556] Tensor parallel size (2) exceeds available GPUs (1) | 0 | 42 | June 28, 2026 |
| MiMo V2.5 Omni on 3x DGX Spark: TP=3 + MTP + 1M context 39 tok/s | 3 | 275 | June 28, 2026 |
| GLM-5.2 with TP=5: hypothetically possible? | 0 | 131 | June 27, 2026 |
| Introducing Tool Eval Bench CLI | 166 | 5919 | June 26, 2026 |
| Three times ( VoiceClone \| VoiceDesign \| CustomVoice ) - Faster-Qwen3-TTS for NVIDIA DGX Spark (GB10) | 54 | 1942 | June 26, 2026 |
| GB10 really does hit ~1 PFLOP NVFP4 (2:4 sparse) — measured, with an open-source tool to reproduce it | 22 | 1208 | June 25, 2026 |
| GLM-5.2 IQ4_XS on 4× GB10 — 6.28 tok/s, DSA active, full recipe | 8 | 2270 | June 25, 2026 |
| Flux.2 Klein 9B on DGX Spark: 2.5x Faster Inference and 59% Lower VRAM with Vitoom Nunchaku | 0 | 139 | June 25, 2026 |
| 3 dgx spark cluster and sparkrun problem | 0 | 99 | June 24, 2026 |
| Running GLM-4.7-FP8 (355B MoE) on 4x DGX Spark with SGLang + EAGLE Speculative Decoding | 38 | 2374 | June 24, 2026 |
| DeepSeek v4 Flash (IQ2XXS) on a single GB10! | 12 | 3941 | June 24, 2026 |
| Atlas: Open-source inference engine for DGX Spark <2minute cold start, 100+ tok/s on Qwen3.6-35B-FP8, 13+ supported models | 100 | 5619 | June 24, 2026 |
| Tauergon agent harness optimized for local llm | 0 | 106 | June 23, 2026 |
| Step-3.7-Flash on single Spark (llama.cpp only) | 17 | 1684 | June 23, 2026 |
| Building Local + Hybrid LLMs on DGX Spark That Outperform Top Cloud Models | 25 | 7046 | June 23, 2026 |
| Commercially Available Rack/Cooling for 1+N Sparks? | 5 | 322 | June 22, 2026 |
| Beta of Steam snap for arm64 | 1 | 776 | June 22, 2026 |
| HOW-TO: setup-dgx-spark docker inference - A "Sane" Inference Stack for GB10 (Need Contributors!) | 39 | 2713 | June 21, 2026 |