Deepseek V4 Pro on 8x DGX Spark

pakasio · June 17, 2026, 2:12pm

I’m not that good with vLLM, but I want to understand whether it’s possible to deploy DeepSeek-V4-Pro (nvidia/DeepSeek-V4-Pro-NVFP4) across all 8 of my nodes.

coder543 · June 17, 2026, 2:18pm

I believe that model quant is only about 850 GiB total, compared to the 976 GiB of usable memory that you have access to, so it should fit. That said, I would recommend looking into GLM-5.2. By all appearances, it is a substantial step up from DSV4 Pro, and it would also fit in your 8x Spark cluster.
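
To make the fit estimate concrete, here is a rough back-of-the-envelope sketch using only the two figures above (about 850 GiB of weights, 976 GiB usable). It assumes the weights shard evenly across the 8 nodes, which may not hold in practice, and the function name is purely illustrative. Whatever headroom remains also has to cover KV cache, activations, and runtime overhead, so treat this as a sanity check rather than a guarantee.

```python
# Rough memory budget for sharding a model across a cluster.
# Assumption: weights split evenly across nodes (real sharding may be uneven).

WEIGHTS_TOTAL_GIB = 850   # approximate size of the quant, per the post above
USABLE_TOTAL_GIB = 976    # usable memory across the whole cluster, per the post above
NODES = 8


def per_node_budget(weights_gib: float, usable_gib: float, nodes: int) -> dict:
    """Return per-node weight share, usable memory, and remaining headroom in GiB."""
    weights_per_node = weights_gib / nodes
    usable_per_node = usable_gib / nodes
    return {
        "weights_per_node": weights_per_node,
        "usable_per_node": usable_per_node,
        "headroom_per_node": usable_per_node - weights_per_node,
        "headroom_total": usable_gib - weights_gib,
    }


budget = per_node_budget(WEIGHTS_TOTAL_GIB, USABLE_TOTAL_GIB, NODES)
for name, gib in budget.items():
    print(f"{name}: {gib:.2f} GiB")
```

With these inputs the headroom works out to roughly 126 GiB across the cluster, or a bit under 16 GiB per node, so the achievable context length and batch size would depend on how much of that the KV cache needs.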

ash.x.kingsley · June 19, 2026, 4:35pm

@pakasio, yes, it’s possible, but getting it running currently requires software versions from outside the primary sources, as discussed in some of the DeepSeek-V4 threads here. An 8x cluster can run current top open-source models like DeepSeek-V4-Pro, Kimi-K2.6, Kimi-K2.7-Code, MiniMax-M3, and GLM-5.2-FP8. These models are close to the current state-of-the-art closed-source models, with the added benefit that they never get reduced to a lower-quality quant the way the proprietary cloud models do when they’re under heavy load. It’s a very nice capability for the price.

