New DeepSeek-V4-Flash-DSpark

Dickson · June 27, 2026, 8:32am

A faster version of DeepSeek V4 Pro / Flash just dropped.

arthurdroz · June 27, 2026, 10:29am

Do we need to enable the speculative config in vLLM for this, or is it built into the model?

wolttam · June 27, 2026, 1:16pm

The weights incorporate the drafter, but inference engines will need new support before they can actually make use of it.
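
For anyone wondering what a drafter buys you, here is a rough toy sketch of greedy draft-and-verify (speculative) decoding in plain Python. It is not how DSpark or any particular engine implements it: `target_next`, `draft_next`, and `speculative_generate` are made-up stand-ins, the "models" are simple arithmetic rules, and the verify step loops where a real engine would score all drafted positions in one batched forward pass.

```python
from typing import Callable, List, Tuple

# Hypothetical toy "models": each maps a token sequence to its greedy next token.
NextToken = Callable[[List[int]], int]


def target_next(tokens: List[int]) -> int:
    # Stand-in for the large model: next token is (last + 1) mod 10.
    return (tokens[-1] + 1) % 10


def draft_next(tokens: List[int]) -> int:
    # Stand-in for the drafter: agrees with the target except right after a 4.
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10


def greedy_generate(prompt: List[int], target: NextToken, num_new: int) -> List[int]:
    # Baseline: one target call per generated token.
    tokens = list(prompt)
    for _ in range(num_new):
        tokens.append(target(tokens))
    return tokens


def speculative_generate(
    prompt: List[int],
    target: NextToken,
    draft: NextToken,
    num_new: int,
    k: int = 4,
) -> Tuple[List[int], int]:
    tokens = list(prompt)
    goal = len(prompt) + num_new
    verify_passes = 0
    while len(tokens) < goal:
        # 1. The drafter proposes up to k tokens, one after another.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target checks the proposal. Counted as one pass because a
        #    real engine would verify all drafted positions together.
        verify_passes += 1
        accepted: List[int] = []
        for guess in proposal:
            expected = target(tokens + accepted)
            accepted.append(expected)  # always keep the target's own token
            if guess != expected:
                break  # first mismatch: discard the rest of the draft
        tokens.extend(accepted)
    return tokens, verify_passes


if __name__ == "__main__":
    out, passes = speculative_generate([0], target_next, draft_next, num_new=8)
    print(out, passes)
```

The output always matches plain greedy decoding with the target, since every kept token is the target's own choice. The drafter only changes how many verify passes are needed, which is why the engine has to know about the drafter to get any speedup from it.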
