Tomorrow (or during GTC 2026), I do expect to see NemoClaw announced officially, and the DGX Spark may be the perfect desktop device for this configuration. The promise of enterprise security and safety modifications to the original OpenClaw should lead to a large spike in Spark adopters like myself, especially among the anti-Apple crowd.
With its high prefill throughput compared to other devices in its price range, it should be ideal for agentic tasks, only dropping down to decode speed when output needs to be saved or shown to a human.
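Roughly, the prefill/decode trade-off above can be sketched with a toy latency model. The throughput numbers below are illustrative placeholders I made up for the sketch, not measured DGX Spark figures:

```python
# Toy latency model for an agentic loop: most tokens are ingested
# (prefill) and only short results are generated (decode), so a device
# with fast prefill but modest decode can still feel quick for agents.
# All throughput numbers are hypothetical placeholders.

def turn_latency_s(prompt_tokens: int, output_tokens: int,
                   prefill_tps: float, decode_tps: float) -> float:
    """Seconds for one model turn: prefill the context, then decode the output."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

# An agent re-reading a large context but emitting only a short tool call:
agentic = turn_latency_s(30_000, 200, prefill_tps=2_000, decode_tps=20)
# A chat-style turn: short prompt, long generated answer:
chatty = turn_latency_s(500, 2_000, prefill_tps=2_000, decode_tps=20)

print(f"agentic turn: {agentic:.0f}s, chatty turn: {chatty:.0f}s")
```

With these made-up numbers, the agentic turn is dominated by (fast) prefill while the chatty turn is dominated by (slow) decode, which is the point of the comment above.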
If the inference isn't stable, given the playbooks needed every time there's a new model, I don't believe in a magical release. If it's going to use Ubuntu and pull from the Nvidia API, well, any Ubuntu server can do that. I'm eagerly awaiting any release, but with low expectations.
Interesting that it claims to be "enterprise ready" but is supposedly still OpenClaw under the hood.
Network guardrails, enterprise policy, and privacy routing are claimed.
I kind of expected something a bit more like NanoClaw: simple, smaller, contained, built from the ground up on best practices. But that wouldn't carry the claim of being the most important software of all time.
They also announced a Nemotron coalition; Mistral AI and Black Forest Labs are part of it. I find that even more exciting than NemoClaw... :-D
I sense a certain leather-jacket hubris spreading: with well-tinted sunglasses, you can no longer see even the sun on the horizon, because anything that doesn't start with an "N" and threatens to stand taller simply gets filtered out. Or is this more of a "we don't love Chinese" thing?
Either way, N ends up undermining its own genuinely remarkable achievement by overselling it so aggressively, as usual?
Especially with LLMs, it's all just water in the same pot; everyone boils at the same temperature. Oh boy.
For math, code, and science, we start from curated problem sets and use open source permissive models such as GPT-OSS-120B to produce step-by-step reasoning traces, candidate solutions, best-of-n selection traces, and verified CUDA kernels.
Benchmarks

| Benchmark | Nemotron 3 Super | Qwen3.5-122B-A10B | GPT-OSS-120B |
|---|---|---|---|
| **General Knowledge** | | | |
| MMLU-Pro | 83.73 | 86.70 | 81.00 |
| **Reasoning** | | | |
| AIME25 (no tools) | 90.21 | 90.36 | 92.50 |
| HMMT Feb25 (no tools) | 93.67 | 91.40 | 90.00 |
| HMMT Feb25 (with tools) | 94.73 | 89.55 | — |
| GPQA (no tools) | 79.23 | 86.60 | 80.10 |
| GPQA (with tools) | 82.70 | — | 80.09 |
| LiveCodeBench (v5, 2024-08 to 2025-05) | 81.19 | 78.93 | 88.00 |
| SciCode (subtask) | 42.05 | 42.00 | 39.00 |
| HLE (no tools) | 18.26 | 25.30 | 14.90 |
| HLE (with tools) | 22.82 | — | 19.00 |
| **Agentic** | | | |
| Terminal Bench (hard subset) | 25.78 | 26.80 | 24.00 |
| Terminal Bench Core 2.0 | 31.00 | 37.50 | 18.70 |
| SWE-Bench (OpenHands) | 60.47 | 66.40 | 41.90 |
| SWE-Bench (OpenCode) | 59.20 | 67.40 | — |
| SWE-Bench (Codex) | 53.73 | 61.20 | — |
| SWE-Bench Multilingual (OpenHands) | 45.78 | — | 30.80 |
| TauBench V2: Airline | 56.25 | 66.00 | 49.20 |
| TauBench V2: Retail | 62.83 | 62.60 | 67.80 |
| TauBench V2: Telecom | 64.36 | 95.00 | 66.00 |
| TauBench V2: Average | 61.15 | 74.53 | 61.00 |
| BrowseComp with Search | 31.28 | — | 33.89 |
| BIRD Bench | 41.80 | — | 38.25 |
| **Chat & Instruction Following** | | | |
| IFBench (prompt) | 72.56 | 73.77 | 68.32 |
| Scale AI Multi-Challenge | 55.23 | 61.50 | 58.29 |
| Arena-Hard-V2 | 73.88 | 75.15 | 90.26 |
| **Long Context** | | | |
| AA-LCR | 58.31 | 66.90 | 51.00 |
| RULER-100 @ 256k | 96.30 | 96.74 | 52.30 |
| RULER-100 @ 512k | 95.67 | 95.95 | 46.70 |
| RULER-100 @ 1M | 91.75 | 91.33 | 22.30 |
| **Multilingual** | | | |
| MMLU-ProX (avg over langs) | 79.36 | 85.06 | 76.59 |
| WMT24++ (en→xx) | 86.67 | 87.84 | 88.89 |
Can someone explain how this model is "better"?
For the moment, my experience is that it is not performing well on sm121, and the benchmark data shows Qwen3.5 122B has better overall results.
I can only confirm that; so far, only Nvidia's marketing is better :)
These links don't seem to work any more, and it's not clear how to configure a local model to use with NemoClaw. Has anyone done it yet and can share the details?