Open-source recipe + scaffold: training a DSpark-class speculative-decoding draft for Nemotron

rneble · June 29, 2026, 10:23am

DeepSeek’s new DSpark / DeepSpec trainer ships with target support for Qwen3 and Gemma only. A lot of us here run Nemotron on Spark, so I built the missing piece and open-sourced it.

Disclosure up front: I’m Rhyan Neble, founder of Extended Systems Intelligence (XSI). We do most of our Nemotron work on DGX Spark, and this came out of it.

Repo (Apache-2.0): Extended-Systems-Intelligence/nemotron-dspark-recipe on GitHub. Community recipe + reference scaffold for training a DSpark-class speculative-decoding draft model for NVIDIA Nemotron, extending DeepSeek's DeepSpec.

What it gives you:

The four DeepSpec extension points wired for a Nemotron target — a chat template, a draft-config builder that maps Nemotron’s transformer dimensions into the draft, a NemotronDSparkTrainer, and a worked config.
A step-by-step recipe: data prep → target cache → draft training → eval → serving, with notes for NVIDIA cloud GPUs.
Field notes on the Nemotron-H hybrid Mamba-Transformer checkpoints — where the stock Hugging Face generate/cache path gets in the way of hidden-state extraction during the cache stage, and the use_cache=False + output_hidden_states path that works. That’s the part that cost us time; it’s written down so it doesn’t cost you any.
A no-GPU selftest.py to catch integration breakage before a long run.
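For anyone who wants to see the shape of the use_cache=False + output_hidden_states path before opening the repo, here is a minimal sketch. It is not the repo's code: the function name extract_target_hidden_states is hypothetical, and it assumes the checkpoint follows the usual Hugging Face convention where hidden_states[0] is the embedding output and hidden_states[i] is the output of block i. Check that against the actual Nemotron-H modeling code before trusting the layer indices.

```python
from typing import Any, Dict, Sequence


def extract_target_hidden_states(
    model: Any,
    input_ids: Any,
    attention_mask: Any,
    target_layer_ids: Sequence[int],
) -> Dict[int, Any]:
    """Run one plain forward pass and return hidden states for the chosen layers.

    Hypothetical helper for illustration; `model` is any callable with the
    Hugging Face causal-LM forward signature.
    """
    # A plain forward call instead of generate(): no cache object is built,
    # and the per-layer hidden states come back on the output.
    outputs = model(
        input_ids=input_ids,
        attention_mask=attention_mask,
        use_cache=False,
        output_hidden_states=True,
    )
    hidden_states = outputs.hidden_states

    # Assumption: index 0 is the embedding output, index i is block i's output.
    return {layer_id: hidden_states[layer_id] for layer_id in target_layer_ids}
```

With a real checkpoint you would typically load the model with AutoModelForCausalLM.from_pretrained(..., trust_remote_code=True), call this under torch.no_grad(), and write the returned tensors to the target cache. The layer IDs to pass are whatever your config uses for target_layer_ids.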

What it’s not: a benchmarked checkpoint. The scaffold is written against DeepSpec’s real interfaces but hasn’t been trained end-to-end and tuned yet — it’s a starting point. If you train a draft with it on Spark, I’d like to see your accepted-length numbers and target_layer_ids — open an issue or reply here.

alexander.kachur · June 29, 2026, 2:59pm

Hi Rhyan,

Thanks for sharing.

Have you tried using it? What inference engine are you using, and are there any benefits vs. an EAGLE drafter?

rneble · June 29, 2026, 3:49pm

I’m currently training on my Sparks. I’m hoping to post my own numbers later this week. Once I get it trained up, I will share it along with any field notes or changes to the scaffolding.

rneble · June 29, 2026, 4:00pm

DeepSeek reports DSpark accepting 26.7–30.9% more tokens than EAGLE-3, but that’s their number on Qwen3, and it’s accepted length, not end-to-end tok/s. Promising on paper; it needs independent benchmarking.

Our recipe isn’t a bet against EAGLE. DeepSpec, the framework underneath, ships both EAGLE-3 and DSpark, and the Nemotron extension is the same four touchpoints either way. So the recipe can train an EAGLE-3 draft for Nemotron too (we did DSpark first). I’ll add a NemotronEagle3Trainer to the repo if that would help; just let me know. Either way, I will train and benchmark both as soon as I can, so stay tuned.
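On why accepted length and tok/s are not the same number: a rough back-of-envelope model, not anything from DeepSeek's report. It uses the standard speculative-decoding cost accounting, where each cycle pays for one drafting round plus one target verification pass and yields the accepted tokens. The function name and both cost parameters are my own illustration, and it assumes accepted length counts every token emitted per cycle.

```python
def estimated_speedup(
    accepted_length: float,
    draft_round_cost: float,
    verify_cost: float = 1.0,
) -> float:
    """Estimate speedup over plain autoregressive decoding.

    accepted_length:  mean tokens emitted per draft-and-verify cycle.
    draft_round_cost: cost of producing one round of draft tokens,
                      in units of one target forward pass.
    verify_cost:      cost of the target verification pass, same units
                      (1.0 means "same as a normal decode step").
    """
    if accepted_length <= 0:
        raise ValueError("accepted_length must be positive")
    if draft_round_cost < 0 or verify_cost <= 0:
        raise ValueError("costs must be non-negative, verify_cost positive")

    # Baseline decoding emits 1 token per unit of cost; a speculative cycle
    # emits accepted_length tokens for (draft_round_cost + verify_cost).
    return accepted_length / (draft_round_cost + verify_cost)


# Illustrative, made-up inputs: same accepted length, different draft cost.
cheap_draft = estimated_speedup(accepted_length=3.0, draft_round_cost=0.25)
costly_draft = estimated_speedup(accepted_length=3.0, draft_round_cost=0.50)
```

The point of the sketch: a draft that accepts more tokens but costs more per round can give back part of its gain, so the accepted-length delta is an upper bound on the tok/s delta only if draft and verify costs are equal between the two methods. That is what the benchmark has to measure.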
