What I've learned so far as a "non-tech" Day 1 DGX Spark adopter

Hello,

I’m posting this to share learning, much of which will probably be in the “Captain Obvious” category for engineers and enthusiasts that work with HPC/AI/Linux technology 25/7.

Quick background - I’m from a bygone era of Unix, VMS and other large-scale system architectures that today’s burner phones can outperform before they power up. While I dabble in modern tech and YouTube daily (learning just enough Linux and Python to make me dangerously incompetent with Ollama, LM Studio, n8n and other buzztools), I am prone to getting cut on the edge instead of actually cutting it.

That background out of the way, here’s what I’ve learned (and some ‘what I wish I had known’) insights with my DGX Spark so far, in no particular order except perhaps a relative timeline, and not a prioritized list of learnings. I’m sure a lot of this may fall into the “noob” perception, so if that kind of commentary isn’t a value-add, please skip to the next posting in your to-read list.

  1. The first-time power-up Bluetooth experience will apparently have the DGX Spark latch on to the first thing that identifies as a keyboard. If (like me) you have a DaVinci Speed Editor in range (which identifies as a keyboard), you’ll suddenly face hours of fun trying to complete your setup using a video editing wheel, or desperately trying to find a wired keyboard and mouse. Since the DGX Spark thinks a keyboard has been connected and the first-time pairing opportunity has been satisfied, I couldn’t easily put it back into pairing mode. (I thought about trying a command-line alternative to reactivate Bluetooth pairing via an SSH session, but that was outside the scope of the NVIDIA support documentation I could find, and while I have a good idea how to do it in vanilla Ubuntu, I wanted to stick as closely as possible to the NVIDIA instructions for the first few moments / hours.)

  2. When you find a wired keyboard and mouse, they’ll likely have USB-A connectors. Assuming you have USB-A to USB-C adapters (I did), you’ll quickly learn that unless they’re dongle (cable) adapters, the housing on most non-dongle adapters is too large for two of them to be plugged in next to each other on the back of the DGX Spark. Since the USB-C slot next to the power button is taken up by the power supply, it’s really difficult to plug both a keyboard and a mouse into the DGX Spark unless both have non-adaptered (is that a word?) USB-C connectors. Fortunately, I had an old Dell keyboard which doubled as a USB-A hub, so I was able to chain the mouse to the keyboard and use only one slot on the back of the DGX Spark. From there, I was able to reactivate Bluetooth pairing, disable the DaVinci Speed Editor, and add a Logitech keyboard and mouse.

  3. Remote Desktop access is difficult for…reasons? I access the vast majority of systems in my homelab remotely, so I know my way around RDP and VNC tools sufficiently to function (including the headaches Raspberry Pi owners face with the “Wayland” issue). While the standard Ubuntu menu options for enabling Remote Desktop and Remote Login access were exactly where I expected to find them in NVIDIA OS, the Remote Login feature renders the output as an interlaced, vertically wrapped mess. Searching on these symptoms, independent of any reference to the DGX Spark, I found identical examples that fit into a larger category of GNOME issues (specific to Remote Login functionality). I found some solutions contributed by individuals that involved recompiling NVIDIA-specific libraries with nvcc for particular architectures, but they targeted 30-series and 50-series architectures. I was rapidly finding myself out of my depth (at least in terms of how much time I wanted to spend on this specific problem). I am currently looking into installing XRDP (a common solution from what I’ve discovered so far), but since that’s outside of NVIDIA’s recommendations (at least as far as I can find), I still wanted to look at NVIDIA-supported alternatives for remote access, which leads me to my next learning…

  4. Some of NVIDIA’s remote access/development tools don’t install successfully on every x86 platform. I decided to install NVIDIA Sync and NVIDIA AI Workbench on two of my Windows 11 Pro workstations. NVIDIA Sync installed successfully on both systems. On one system (a Strix Halo-based mini PC with an existing WSL2 and Docker configuration), NVIDIA AI Workbench fails to install (saying only that there is a problem, and to try reinstalling). On the other system (an Intel NUC with built-in Intel graphics, and no pre-existing WSL or Docker configuration), the installation worked right out of the gate. At this stage, I really would rather get RDP-based access up and running on the DGX Spark and keep all of my development, testing and publishing confined to that system, rather than add a layer of abstraction that I have limited time to manage. Before making changes to the DGX Spark outside of the NVIDIA-recommended experience, I wanted to make sure I had proper backup and recovery procedures and assets in hand, which led me to the next learning…

  5. Backup and recovery on the DGX Spark is different from what I would expect on other (production) systems. I had read ahead in the NVIDIA forums and seen posts such as “I loaded a 65GB model and now my DGX Spark won’t boot”, along with other examples where numerous DGX Spark owners had to go through system recovery within the first 24 hours. I wanted to get ahead of that potential timeline by putting together a recovery USB stick and setting up a backup plan to preserve my system (not just my individual files or work), the same way I would in a Windows or FreeBSD environment. I was able to follow NVIDIA’s documentation to the recovery download and created a USB recovery key. However, for backups, all I can initially find within NVIDIA OS is Deja Dup Backups, which seems to focus on individual user files rather than preserving the system as a whole. I’m looking at other tools (or creating my own set of scripts) to do complete backups of the DGX Spark before working through the lessons and frankly amazing examples shown on the DGX Spark | Try NVIDIA NIM APIs page, given that others are posting about (in some cases) bricking scenarios, which gives me concern.

  6. System updates from a web page instead of a command line. The NVIDIA documentation emphasizes that updates should preferably be performed through the DGX Dashboard instead of the typical “sudo apt update” chain. I was surprised to stumble on this during my random startup exploration of the system rather than see it as a highlighted recommendation in something like the Quick Start guide (maybe I missed it). There will be tools and other third-party software I could see installing on the DGX Spark whose own installation instructions call for the “sudo apt update” and upgrade chain. Should command-line updates be avoided at all costs? Do command-line updates void the warranty? Again, maybe I’ve missed something, but this seems like a very significant exception to what is otherwise standard (or at least expected?) Linux distribution maintenance.
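On item 5, a home-grown backup script could start from something like the sketch below. It is only an illustration, not an NVIDIA-recommended procedure: the destination path `/mnt/backup/spark-root/`, the exclude list, and the helper name `build_rsync_command` are all assumptions made up for this example. It defaults to a dry run and only prints the command rather than running it.

```python
import shlex

# Pseudo-filesystems and scratch paths that are usually skipped in a
# full-system copy. This list is an assumption, not NVIDIA guidance.
DEFAULT_EXCLUDES = [
    "/dev/*", "/proc/*", "/sys/*", "/run/*", "/tmp/*",
    "/mnt/*", "/media/*", "/lost+found",
]


def build_rsync_command(destination, excludes=DEFAULT_EXCLUDES, dry_run=True):
    """Return an rsync argument list that mirrors / into destination."""
    # -a archive mode, -A ACLs, -X extended attributes, -H hard links
    cmd = ["rsync", "-aAXH", "--delete"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change, copy nothing
    cmd.extend(f"--exclude={path}" for path in excludes)
    cmd.extend(["/", destination])
    return cmd


if __name__ == "__main__":
    # Hypothetical mount point for an external ext4 disk.
    command = build_rsync_command("/mnt/backup/spark-root/")
    print("sudo " + shlex.join(command))
```

Excluding `/mnt/*` matters here because the hypothetical destination lives under `/mnt`; without it the backup would try to copy itself. Reviewing the dry-run output before passing `dry_run=False` seems like the cautious order of operations.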

At some point, I do hope to actually be loading and tuning models, and put a few agents to work. ;-)

I hope these individual insights were helpful. Thank you for your time.

jim180, thanks for the detailed write-up. I’ll share your feedback with our engineering team.

Thank you. Please also share that I still have a lot of enthusiasm for the potential of the platform.

Hi @jim180 @NVES, I’m wondering if the DGX Spark lets you install things like cloudflared and create a tunnel, so that you can access JupyterHub over the internet?

Do you think anything would prevent doing so?
thanks!

My first day with the Spark was all about figuring out that I needed the USB-A to USB-C dongles. I had to toss my USB-C keyboard and go back to an older USB-A keyboard with a dongle to get it recognized. My second day is now all about “No Wi-Fi Adapter Found”, i.e., no internet connectivity. I am not sure non-sysadmin people are going to succeed at all with this box.

I’ll defer to NVidia support for the official answer, but where one of the examples on the build page includes setting up Tailscale, I would think the goal of surfacing a DGX Spark to another point on the internet (within a Tailscale VPN network) is an expected use case. (I use Tailscale to surface a number of endpoints between different facilities I have within my own Tailscale network, and my DGX Spark will be added to them.)

Sounds good @jim180! that’s good to know!

A follow-up:

  1. I was able to install Timeshift through an Apps menu search. After formatting and attaching a Samsung 1TB external SSD as a Linux ext4 disk, I’m now able to create full and incremental system backups.

  2. Surprisingly, I was able to install XRDP without a hitch, and can now connect remotely to the DGX Spark (with an Xorg environment). There are a few hiccups to iron out with regards to xsession settings being overwritten depending on whether I access the DGX Spark remotely or directly, but it’s manageable. One casualty after getting XRDP up and running seems to have been the default apps pinned (now vanished post-XRDP login) to the left of the Gnome Desktop (when I logged back into the console after the first XRDP session), but they were easy enough to restore / change. I suspect some xsession settings file(s) are getting over-written by the most recent window manager that touches them.

  3. I was finally able to start going through some of the suggested playbooks, including getting OpenWebUI up and running with a few models. At this point, I’m going to shift over to XRDP as my primary means of accessing the DGX Spark and running through more of the playbooks remotely.

I use RustDesk for remote access. I also use Tailscale and can now access my Spark from anywhere on the Internet, automatically encrypted.

The Spark is so new that a number of the canned examples (e.g., NIM packages) don’t work because they were written before the Spark was available. It’s hard to know what works and what doesn’t without trying, but part of the problem is CUDA and PyTorch version numbers, as well as the Spark’s compute capability of 120, which didn’t exist before and is not in the code for many packages.
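One way to see whether a given PyTorch install is affected is to compare the GPU's reported compute capability against the architectures the build was compiled for. The following is a rough sketch: the helper names `capability_to_arch` and `build_supports_device` are made up for this example, and the PyTorch part only runs if `torch` happens to be installed.

```python
def capability_to_arch(capability):
    """Turn a (major, minor) compute capability into an 'sm_XY' string."""
    major, minor = capability
    return f"sm_{major}{minor}"


def build_supports_device(capability, arch_list):
    """True if the build lists kernels for exactly this architecture."""
    return capability_to_arch(capability) in arch_list


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
    else:
        if torch.cuda.is_available():
            cap = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("device:", capability_to_arch(cap))
            print("build targets:", archs)
            print("exact match:", build_supports_device(cap, archs))
        else:
            print("PyTorch cannot see a CUDA device.")
```

An exact match is a conservative test. A build that also lists a `compute_XY` (PTX) entry may still run on a newer GPU through JIT compilation, so a "False" here is a hint to look closer rather than proof that a package will fail.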

I had looked at RustDesk earlier this year but found it to be more complicated than I had time to figure out, but perhaps it’s time to give it another look. Thanks for the tip.

Let me add a bit of keyboard pain too. During the first boot (when it asked me to connect a keyboard and mouse), I connected my wireless Logitech K400r (just the cheapest keyboard with a touchpad) using a USB-C to USB-A adapter. The keyboard and touchpad were working: I was able to move the mouse pointer and switch between consoles (Ctrl+Alt+F1-F8). But the installer stayed stuck on the “Please connect the keyboard” message and didn’t react to anything. There was also no terminal anywhere, so I couldn’t see what exactly was happening. In the end, I was able to connect over the wireless network and do the setup using my laptop.

Not a big deal, but I feel it could be fixed easily in the next batches.

Regarding loading a 65GB model and getting stuck: that sounds weird to me. I have loaded a bunch of models with Ollama just for testing, including gpt-oss, which I believe is the 65GB one, as well as DeepSeek and Qwen (I did some benchmarking). I also quantized deepseek-ai/DeepSeek-R1-Distill-Qwen-32B (FP16, ~62GB) to NVFP4, which is native to the DGX Spark. No issues at all.

But what is funny: when I tried to insert a USB drive into the DGX Spark, I had to remove the keyboard dongle, because my USB drive has a sliding right cover that didn’t let me use the dongle and the USB drive at the same time. Luckily, I was using the keyboard only during setup and did all other operations over SSH.

Out of interest, has anyone used a USB-C hub to connect a wired keyboard and mouse, given the comments about the space between the USB-C ports?

I plugged in a USB-C hub that has USB-C, USB-A and HDMI ports while I was trying to figure out my “blank screen won’t wake up” issue. A wired keyboard and mouse worked fine that way. The wake-up issue wasn’t caused by the Bluetooth keyboard/mouse. I had to go to power settings and change the screen timeout to ‘never’ until a fix for the wake-up problem is published.
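For anyone who only has SSH or a terminal handy, the same "never blank" setting can probably be applied from a script. This is a sketch under the assumption that the Spark's desktop is stock GNOME, where the blank-screen timeout is stored in the `org.gnome.desktop.session` `idle-delay` key (0 means never); the function names are made up for this example.

```python
import subprocess


def idle_delay_command(seconds=0):
    """Build the gsettings call; 0 seconds means 'never blank'."""
    return ["gsettings", "set", "org.gnome.desktop.session",
            "idle-delay", str(int(seconds))]


def apply_idle_delay(seconds=0):
    """Run the command; must be run as the logged-in desktop user."""
    subprocess.run(idle_delay_command(seconds), check=True)
```

Calling `apply_idle_delay(300)` later would restore a five-minute timeout once a fix for the wake-up problem is available.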

The garbled remote desktop image is a known issue that there’s a fix for upstream: https://gitlab.gnome.org/GNOME/gnome-remote-desktop/-/merge_requests/344

Here is the link for tracking this getting merged into the 24.04 noble updates.

Bug #2127792 “Images are corrupted on blackwell GPUs” : Bugs : gnome-remote-desktop package : Ubuntu

I wonder if anyone has tried the Network Appliance Mode for initial setup, described here: https://docs.nvidia.com/dgx/dgx-spark/first-boot.html
Network Appliance Mode

  1. Power on the system. This creates a Wi-Fi hotspot that you will use to connect to the system and continue the setup process. The SSID and password for the Wi-Fi hotspot are printed on a sticker attached to the Quick Start Guide included with your DGX Spark’s packaging

  2. From another computer, connect to the Spark’s Wi-Fi hotspot using the SSID and password provided on the Quick Start Guide. A captive portal page will open in the default web browser on your computer. If it does not open automatically, use your browser to navigate to the Spark’s system setup page listed on the Quick Start Guide.

  3. Follow the on-screen prompts to continue the setup process.

    When the DGX Spark joins your home network, its Wi-Fi hotspot will turn off and your computer will reconnect to the device through your Wi-Fi network to resume the setup process. If you are not able to connect to the DGX Spark after it joins your home network, you must connect a display/keyboard/mouse to continue.

In my case, I gave it a few moments of thought before diving ahead and just plugging it into my wired network, adding a monitor, and eventually a keyboard (see my earlier comments in this thread on the keyboard sitcom). I’m not ashamed to admit that I looked at the sticker on the front page of the Quick Start booklet and thought, “Why do I want another wi-fi network when this is going into my homelab rack?” (It has since migrated to my desk with a GbE hub.)

A link (on the Quick Start cover) to the page you referenced or a YouTube video link that outlined the procedure would have caught my attention. (Not a criticism, just feedback. I know you’ve all put a lot of work into this system and getting it launched.)

I’m almost curious enough to consider reflashing my Spark and starting over to see what the first boot experience is like. (But not that curious.) ;-)

Thanks again for your insights and help.

I followed the Network Appliance Mode process. All went well until the updates, when the progress bar stalled out about 2/3 of the way through, presumably when the Spark rebooted. The JavaScript console showed some failed requests at that point.

I was able to SSH to spark-xxxx.local and add the Spark to NVIDIA Sync using the spark-xxxx.local DNS address at that point, though, so it wasn’t a blocker.
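If the setup page stalls like this, a quick check from another machine can confirm whether the Spark is back up before trying SSH. Here is a minimal sketch using only the Python standard library; `tcp_port_open` is a made-up helper name, and `spark-xxxx.local` is the placeholder hostname from the post above, not a real one.

```python
import socket


def tcp_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name-resolution failures, refusals and timeouts.
        return False
```

Usage would be along the lines of `tcp_port_open("spark-xxxx.local")`. Resolving a `.local` name depends on mDNS support on the client machine, so a `False` result can mean either that the Spark is still rebooting or that the name could not be resolved.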

jim180 - My background and experience are similar to yours, and I resolved the issues in the same way. I was able to at least do a backup to an NFS volume, yet I am not sure just what needs backing up. If there is documentation for this somewhere, please point me to it. Are there any NVIDIA-recommended procedures for backup and recovery? There is much research and setting up to do; the first set of updates fubared the time setup. SSH seems to work and I have exchanged keys, but something is still not right. I will take a look at XRDP; I am spending too much time on OS integration and not on AI. Thanks for the feedback.