I am extremely disappointed with the current state of DGX Spark

Dear NVIDIA,

I am extremely disappointed with the current state of DGX Spark.

This product was presented as a premium Blackwell platform with FP4, NVFP4, strong local AI capabilities, and a mature software experience. In reality, the experience has felt far less polished and far less complete than that message suggested.

The biggest example is NVFP4. It was clearly promoted as one of the major advantages of the platform, yet for too long the real-world experience on DGX Spark has felt immature, inconsistent, and overly dependent on workarounds. Instead of feeling like a reliable flagship capability, it has often felt like something users had to struggle to unlock on their own.

Compatibility has been another major frustration. Too many workflows still turn into time-consuming troubleshooting sessions involving unsupported paths, architecture-related issues, custom builds, patches, and repeated trial and error. For a premium NVIDIA system, that is simply not the level of usability many customers expected.

To be fair, DGX Spark does have real strengths. The hardware is promising, prefill can be strong, and some models and quantization formats perform reasonably well. That is exactly why this situation is so disappointing: the hardware shows clear potential, but the software experience still too often feels incomplete, fragile, and dependent on patience rather than proper support.

From a user perspective, this creates a very bad impression. It feels like key capabilities were promoted before they were fully ready in practice on DGX Spark. Too much of the experience has felt like wasted time, endless hoops to jump through, and constant workaround hunting instead of a premium out-of-the-box experience.

At this point, the issue is not just performance. It is trust.

Users need clear and honest communication about what is truly mature, what still has limitations, and what remains a work in progress. More than anything, DGX Spark needs software support that actually matches the expectations created around it.

Right now, the gap between the promise and the real experience is too large, and that is why so many users are frustrated.

Sincerely!!!

plus one!

My post from yesterday:

OpenClaw / Nemotron-3-Super is why I purchased the DGX Spark, but the 3-minute initial load time, 1-3 minute response times, 15 t/s, and crashes every few hours are infuriating at this point.

I do hope the team working on this sees this as unacceptable, as it is a far cry from the advertised 1 PFLOP & 100 tokens/sec NVFP4 performance … @aniculescu

Here, let me fix that for you, NVIDIA: “We think it can do 1 PFLOP & 100 tokens/sec NVFP4 performance, but you have to figure that out for yourself, and if you do, let us know; we’ll happily take the credit for it.”

I think the people who have stuck around, such as @eugr, and earned the Spark Expert badge should be given full access and a stack of Sparks, as they are essentially doing NVIDIA’s job for them :-(

ok, now back to my coffee.

Couldn’t agree more. Have been working for more than a week to get some semblance of a working model pipeline running on my 4x GB10 cluster to no avail. This is the equivalent of ordering a car on Carvana and it showing up in a bunch of pieces you have to assemble with no instructions.

I spent around $18K for 4x Asus Ascent GX10s plus a Mikrotik switch thinking all of the Nvidia “ecosystem” stuff was real and actually would work. Little did I know…

But seriously, huge thanks to the community dev leaders for all their work to make an actually operational ecosystem.

Booo NVIDIA - shame.

Add my name to the list, too. If it weren’t for a few stand-out community contributors including eugr, dbsci and others, the Spark would be a practically dead platform right now. The NVFP4 situation in particular pisses me off.

Right now I am very disappointed in all the ConnectX-7 issues I’m having and that I see other people having. Something is definitely off with the maturity of that portion of the unit. I was very happy with it in single-node mode, but adding additional nodes has been, yeah, a pain in the butt.

Nvidia clearly capped a lot of features: ConnectX-7 has weird compatibility problems for NO REASON, and sm121 is a weird solution with no real compatibility with the sm100 architecture. Obviously bad: low performance, strange CPU and GPU voltages and behavior, strange shutdowns, a lack of real kernel support, and not even a real fix for it!!!

Nvidia knows that they delivered this child faulty, but they won’t admit it. The platform concept was right and the support, in theory, was superb, but the execution… was not!!! It is far behind what a normal user experience should be.

Today I had to format my DGX for the third time. I have installed so many patches, so many solutions, and so many fixes that I have gained 10 tok/s more on each LLM loaded. How does this happen? So many bugs, so much unsupportiveness.

The best part of all this pandemonium is this community!!! A really lovely forum.

Same feeling. My friend bought one after seeing my DGX Spark running OpenClaw. Now he regrets it: it’s so slow, so why not just use APIs?

I gave up on Nemotron and switched to a Qwen 3.5 version that someone here posted, which can get 50 tok/s.

I want to keep Nemotron, but it’s so damn fat, and nemoclaw with it is impossible…

Hello everyone,

I’m not necessarily as critical as some people here. This is a young platform — it’s new. ARM64 processors running Linux are still relatively uncommon outside of things like Raspberry Pi (we don’t even have Chrome ARM64 support yet).

Yes, there are real shortcomings on the software side, design issues, and even some hardware disappointments (we were expecting “datacenter-grade” Blackwell, and instead we get more consumer-oriented instructions, limited NVFP4 support, and questionable thermal management).

Yes, inference performance can be a bit disappointing… but from the beginning, it was clear that this platform was not designed for pure inference workloads. If that’s what you’re looking for, you’re probably better off buying a Mac Studio or waiting for the M5.

What I find frustrating is that a company like NVIDIA, valued at many billions of dollars, is not investing more into this platform. It may be niche, but it has a huge amount of untapped potential.

I’m willing to bet we’ll see significant improvements on the Spark by the end of the year — especially in the current hardware-constrained environment, where software optimization will be pushed forward rapidly.

Personally, I’m generating over 1 million tokens per day per Spark using Qwen 3.5 35B-3A, which is already more than sufficient for my needs. (That model is fantastic; I’m not seeing any reason to use Nemotron…) I’ve reached my goals, but yes, it could definitely be faster.
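
For a sense of scale, here is a quick back-of-the-envelope sketch of what 1 million tokens per day works out to as sustained throughput. The helper names are made up for illustration, and the 50 tok/s figure is only borrowed from another post in this thread as an example decode speed, not a measurement of this setup.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_tokens_per_second(tokens_per_day: float) -> float:
    """Average throughput if generation were spread evenly over 24 hours."""
    return tokens_per_day / SECONDS_PER_DAY


def hours_needed(tokens_per_day: float, tokens_per_second: float) -> float:
    """Hours of continuous generation needed to hit a daily token count."""
    return tokens_per_day / tokens_per_second / 3600


avg = average_tokens_per_second(1_000_000)
busy = hours_needed(1_000_000, 50)  # 50 tok/s is an illustrative speed only

print(f"{avg:.1f} tok/s averaged over the whole day")
print(f"{busy:.1f} hours of generation per day at 50 tok/s")
```

So 1 million tokens per day is roughly 11.6 tok/s around the clock, or a bit under six hours of generation per day at 50 tok/s.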

When I look at the growing community around Spark, and projects like vLLM that are actively supporting it, I only see positive signals. I believe most of the current issues will be resolved relatively quickly.

However, if someone from NVIDIA is reading this: please give us more control over thermal management, or at least provide more aggressive default profiles.

Nvidia CEO Jensen Huang has heavily praised OpenClaw, calling it “probably the single most important release of software ever” and describing it as the “operating system of the AI agent era”. The Spark is a near-perfect solution for this for home and small business use.

A trillion-dollar company should recognize this and dedicate a team to working full time on the Spark and the next Spark II, to take the lead in this sector. Make it the default choice for the “Claw” era; your CEO wants this!

If not, you’re missing a massive opportunity to sell millions of these and grow the community…

@aniculescu @johnny_nv

Let’s be honest — this is pure marketing narrative.

Calling OpenClaw “the most important software ever” or “the OS of the agent era” is completely disconnected from the current state of the ecosystem. Nothing today is even close to being an “OS” for agents — we’re still dealing with fragmented tooling, unstable abstractions, and rapidly evolving standards.

This looks far more like a strategic push from NVIDIA to position its own stack — especially Nemotron — than an objective assessment of the field.

And that’s where the argument really falls apart: Nemotron is simply not leading.
If you look at real-world usage and multiple benchmarks, models like Qwen are consistently ahead in efficiency, flexibility, and overall performance.

Now, about Jensen — he’s obviously an exceptional salesman, no debate there.
But honestly, I sometimes wonder if he has ever actually used a Spark in real-world conditions.

Because if he had, I’m pretty sure there would already be a dedicated team of 20 engineers working full-time on fixing the current gaps.

At this point, I’d genuinely be curious to hear a simple answer:
has he personally used a Spark beyond a demo environment?

I am so with you. I spent a WHOLE lot of time trying to get NVFP4 to work.

Even with NIM, the sm_121a compute capability is not supported by standard NIM images.

I MAY have been able to move one step further, and you can check here

I did tell @eugr about this, but since I am not familiar enough, I did not do a PR.

But I have been wondering if I would have had a better time had I chosen a Mac mini/Ultra.

If it’s about some FP8 kernels not being included on sm121 (with “sm120 only” message), this has been fixed in the latest releases - the corresponding PR is included in the spark-vllm-docker builds.

thank you sir! downloading and compiling now!

@eugr Just an idea. There are a lot of people who do not understand the current state of things and how the software is evolving. Would you consider putting together some kind of roadmap for Spark software support so we can see where we are?

I wish I could have one too! Also, Spark software support is an extremely broad category.

I don’t have enough visibility into other projects (other than what’s available publicly), but I know that a lot of effort is being put into better NVFP4 support on consumer Blackwell in general, and Spark specifically.

As for our community projects (like spark-vllm-docker, llama-benchy, Spark Arena and sparkrun), we’ll continue our efforts in making it easier for people to run the latest and greatest models on Spark.

I just hope Nvidia can fix the driver/software issues first, rather than only releasing new models.
IDK if 99% of users are only using it for LLMs... (I feel Nvidia keeps releasing new models so more people will buy their hardware.)

Currently I’m still having the following problems:

  • Using too much RAM causes a system freeze
    e.g. ~110G usage when I set gpu_memory_utilization to 0.7; if I set it to 0.8, the system freezes until I force a reboot
  • A UVC capture device won’t work over USB 3.0 and causes the XHCI controller to crash
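
One possible reading of the memory freeze in the first bullet, sketched as arithmetic. This assumes the Spark’s nominal 128 GB of unified memory is shared between the GPU and the OS, that `gpu_memory_utilization` reserves that fraction of the total for vLLM, and that everything else on the system stays roughly constant between runs. The constant-overhead model and the helper names are illustrative assumptions, not a confirmed diagnosis.

```python
TOTAL_MEMORY_GB = 128.0  # assumption: nominal unified memory, shared by CPU and GPU


def vllm_budget_gb(gpu_memory_utilization: float,
                   total_gb: float = TOTAL_MEMORY_GB) -> float:
    """Memory vLLM is allowed to claim for a given utilization fraction."""
    return gpu_memory_utilization * total_gb


def projected_usage_gb(gpu_memory_utilization: float,
                       overhead_gb: float,
                       total_gb: float = TOTAL_MEMORY_GB) -> float:
    """vLLM budget plus everything else (OS, desktop, other processes)."""
    return vllm_budget_gb(gpu_memory_utilization, total_gb) + overhead_gb


# The post reports ~110 GB in use at 0.7; treat the remainder as fixed overhead.
observed_at_0_7_gb = 110.0
overhead_gb = observed_at_0_7_gb - vllm_budget_gb(0.7)

for fraction in (0.7, 0.8):
    used = projected_usage_gb(fraction, overhead_gb)
    free = TOTAL_MEMORY_GB - used
    print(f"gpu_memory_utilization={fraction}: "
          f"~{used:.1f} GB used, ~{free:.1f} GB left")
```

Under these assumptions, 0.8 would put total usage at roughly 123 GB of the nominal 128 GB, which leaves very little headroom on a machine where the OS and the GPU share the same pool. That would be consistent with the freeze, but it is only a guess from the numbers in the post.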

Not to mention (there are already lots of posts about these):
CUDA driver updates, the 1 petaFLOP claim (or performance-related issues), NVFP4 support,
and tcgen05

Also, after reading posts over these months, what I have mostly seen is:

  • Someone develops some tools or posts tutorials → Nvidia: I’ll move this post to GB10 Project → Ended (why is “all” of this being done by the community and not Nvidia?)
  • Someone reports bugs → Nvidia: Please post/send your logs → user: *Posts the logs* → Ended with no further follow-up or solution, or users find the solution by themselves

In my opinion, the Spark itself isn’t that bad for developers (if you set aside the sm10x/12x difference).
But after the delays, and with it having been out for about half a year now, why are there still so many things that need to be fixed? And some of them are exactly what Nvidia uses in its advertising.

Nvidia needs to fix the software-related issues. The problem with this being a young platform is that Nvidia isn’t young. As gamers, we relied on Nvidia being not only the best product, but the cleanest product. It was plug and play. That’s changed over the last 3 or so years. Nvidia could fix the inference side of things with a decode appliance. It’s the easiest path forward, and I think that’s probably coming. Your answer can’t be “buy a Mac” or “buy a Ryzen AI” for disaggregated decode, because those companies will build prefill machines that seamlessly integrate with them. Hopefully the software support comes with age; the decode/inference speed updates are desperately needed.

Completely agreed. Horrible experience. Need to troubleshoot almost everything.