DGX Spark Completely Inoperable - Need Help (USB Boot Fails, UEFI Inaccessible, System Frozen)

TL;DR: Brand new DGX Spark became a $10K+ brick within 24 hours. System crashes during routine apt upgrade, then Docker container causes complete freeze. Cannot access UEFI reliably, USB boot completely fails, all recovery methods non-functional. 12+ hours troubleshooting, system is unusable. Has anyone else experienced this?


Background

Purchased DGX Spark for local LLM inference (Kimi K2.5, DeepSeek, etc.) and blockchain node operations. This is my first NVIDIA enterprise product - previously used Mac Studio and cloud GPUs.

System arrived Wednesday. Initial setup went fine, got through the first-boot wizard, everything looked beautiful. Then things went catastrophically wrong.


Day 1: The Update That Broke Everything

Thursday morning, January 29, 2026

Did what any reasonable person does with a new Linux system:

sudo apt update && sudo apt upgrade -y

Standard stuff, right? Mid-update, the system spontaneously rebooted. No warning, no error message, just… gone.

When it came back up, I had to reconfigure EVERYTHING:

  • Keyboard mapping

  • Mouse settings

  • Display configuration

  • Network settings

It was like the system forgot it had been set up at all.

Boot logs showed these errors:

platform NVDA8800:00: failed to claim resource 0: [mem 0x05170000-0x051c...]
acpi NVDA8800:00: platform device creation failed: -16
platform NVDA8900:00: failed to claim resource 0: [mem 0xc8000000-0xd7ff...]
acpi NVDA8900:00: platform device creation failed: -16

ACPI resource conflicts on a brand-new, out-of-box NVIDIA system? For NVIDIA devices? That seemed… wrong.
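
For anyone decoding the log lines above: the trailing -16 is a negated Linux errno, and errno 16 is EBUSY ("Device or resource busy"). Here that most likely means the memory range was already claimed by something else. Below is a minimal sketch that pulls the device name and errno out of such lines. The function name `decode_acpi_failures` and the small lookup table are illustrative only, not part of any existing tool.

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<device> <errno-name>" for each "platform device creation failed" line.
decode_acpi_failures() {
  awk '
    /platform device creation failed:/ {
      dev = "?"
      # The device name is the field right after the "acpi" tag,
      # whether or not a dmesg timestamp precedes it.
      for (i = 1; i < NF; i++)
        if ($i == "acpi") { dev = $(i + 1); sub(/:$/, "", dev) }
      code = $NF; sub(/^-/, "", code)      # "-16" -> "16"
      name = "errno " code                 # fallback for codes not listed here
      if (code == "12") name = "ENOMEM"
      if (code == "16") name = "EBUSY"
      if (code == "19") name = "ENODEV"
      if (code == "22") name = "EINVAL"
      print dev, name
    }'
}
```

Typical use would be `sudo dmesg | decode_acpi_failures`, assuming the log still contains the lines quoted above.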

Spent 2 hours adding kernel parameters to work around it:

GRUB_CMDLINE_LINUX_DEFAULT="console=tty0 console=ttyS0,921600 pci=realloc pci=nocrs acpi=force iommu=pt nvidia-drm.modeset=1"

Got it stable-ish. Figured maybe it was just a quirk. Moved on.
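
A side note on the GRUB line above: it sets pci= twice. The kernel parameter documentation describes pci= as taking a comma-separated option list, so pci=realloc,nocrs is the documented spelling. I have not confirmed how two separate pci= entries behave on this system. Here is a small sketch that lists parameter keys appearing more than once in a command line. The function name `repeated_kernel_params` is made up for illustration.

```shell
# Hypothetical helper: print every kernel parameter key that occurs
# more than once in a space-separated command line.
repeated_kernel_params() {
  # $1: the value of GRUB_CMDLINE_LINUX_DEFAULT
  printf '%s\n' "$1" |
    tr ' ' '\n' |        # one parameter per line
    sed 's/=.*//' |      # keep only the key before "="
    grep -v '^$' |       # drop empty lines left by repeated spaces
    sort | uniq -d       # print keys seen at least twice
}

repeated_kernel_params "console=tty0 console=ttyS0,921600 pci=realloc pci=nocrs acpi=force iommu=pt nvidia-drm.modeset=1"
```

Repeating console= is normal (one entry per console), so pci is the key worth a second look. On Ubuntu, edits to /etc/default/grub only take effect after running sudo update-grub.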


Day 2: The Crash That Killed Everything

Friday morning, January 30, 2026

Time to actually use this thing for what I bought it for. Tried deploying Kimi K2.5 locally via vLLM in Docker:

docker run -d \
  --name kimi-k2.5 \
  --gpus all \
  --restart always \
  -p 8000:8000 \
  -v ~/models/kimi-k2.5:/model \
  vllm/vllm-openai:latest \
  --model /model \
  --host 0.0.0.0 \
  --port 8000 \
  --trust-remote-code

Standard vLLM deployment. Nothing exotic.

System immediately crashed. Complete freeze. Had to hard reset (held power button).

On reboot: System completely inoperable.

What I see:

  • Ubuntu login screen appears

  • Can type password (keyboard works!)

  • Press Enter…

  • Desktop starts to load…

  • Complete freeze

  • Mouse cursor frozen

  • Keyboard stops responding

  • Nothing works

Can’t access:

  • ❌ Ctrl+Alt+F1-F6 (TTY consoles) - No response

  • ❌ Ctrl+Alt+T (terminal) - No response

  • ❌ Ctrl+Alt+Backspace - No response

  • ❌ Any keyboard shortcuts - No response

SSH: Connects for maybe 10-15 seconds after boot, then freezes. Not enough time to run commands.

The Docker container with --restart always is launching on boot, crashing the system, preventing me from disabling it. Classic catch-22.


The 12-Hour Troubleshooting Odyssey

Attempt 1: GRUB Recovery Mode

Tried to access GRUB:

  • Hold Shift during boot - Doesn’t work

  • Tap Esc repeatedly - Works maybe 1 in 20 times

  • Hold Shift while pressing power - Doesn’t work

  • Ctrl+Alt+Del then Esc - Doesn’t work

When I DID get GRUB once:

  • Got to grub> prompt

  • Typed normal - system booted to frozen login again

  • Tried editing boot parameters (press ‘e’):

    • Added systemd.unit=rescue.target - System ignored it

    • Added single init=/bin/bash - System ignored it

    • Added systemd.unit=multi-user.target - System ignored it

Boot modifications are not being applied. Why?

Attempt 2: UEFI/BIOS Access

Keys I’ve tried to access UEFI:

  • F1, F2, F8, F10, F11, F12, Del, Esc

  • Hold methods, tap methods, various timing combinations

  • Pressed every key imaginable during boot

Success rate: ~10% (got in maybe 2-3 times out of 25+ attempts)

When I DID get into AMI Aptio Setup:

  • Boot tab shows: “Boot Option #2: UEFI PXE IPv4…”

  • Boot Option #1 is missing (where’s the internal drive?)

  • USB drives never appear in boot options (more on this below)

  • Can’t find Secure Boot setting anywhere

For a $10K+ system, I expect UEFI access to work 100% of the time, not 10%.

Attempt 3: USB Recovery Boot

Created Ubuntu 24.04.3 LTS bootable USB:

  • Used dd on macOS: sudo dd if=ubuntu.iso of=/dev/rdisk6 bs=1m

  • Verified completion: 753+0 records in/out, 789,577,728 bytes transferred

  • Checked structure on Mac: EFI partition present, bootable structure confirmed

  • This USB boots fine on my MacBook and other systems
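
One check the list above does not cover is the image's CPU architecture. The GB10 in the DGX Spark is an Arm64 (AArch64) part, and UEFI firmware looks for a removable-media loader named after its own architecture: EFI/BOOT/BOOTAA64.EFI on Arm64, EFI/BOOT/BOOTX64.EFI on x86-64. The post does not say which ISO was written, so this is something to rule out, not a diagnosis. A minimal sketch that classifies loader filenames found on the stick follows. The function name `efi_loader_arch` is hypothetical.

```shell
# Hypothetical helper: read file paths on stdin and print the architecture
# of each default UEFI removable-media loader among them.
efi_loader_arch() {
  tr 'a-z' 'A-Z' |       # FAT filenames are case-insensitive
    awk -F/ '
      $NF == "BOOTAA64.EFI" { print "aarch64" }
      $NF == "BOOTX64.EFI"  { print "x86_64" }
      $NF == "BOOTIA32.EFI" { print "i386" }'
}
```

With the stick mounted, something like `find <mount-point>/EFI -type f | efi_loader_arch` would show which firmware architectures the image can boot on. The mount point is a placeholder.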

Tried to boot from USB on DGX:

  • ❌ USB never appears in UEFI boot options (tried multiple times)

  • ❌ Changed boot priority to USB #1 in UEFI - system still boots to internal drive

  • ❌ F11 boot menu - Never appears

  • ❌ F12 boot menu - Never appears

  • ❌ F8 boot menu - Never appears

  • Tried front USB ports - Not detected

  • Tried rear USB ports - Not detected

  • Tried USB 2.0 ports - Not detected

  • Tried USB 3.0 ports - Not detected

  • Tried different USB drives - None detected

The DGX Spark cannot detect USB boot media at all. This makes standard recovery impossible.

Attempt 4: Different Keyboards/Mice

Thought maybe it was peripheral compatibility:

  • Swapped to wired USB keyboard - Same issue

  • Swapped to wireless keyboard (different brand) - Same issue

  • Tried different USB ports - Same issue

  • Multiple mice tested - Same issue

Keyboard works perfectly for password entry, then complete lockup. This suggests the freeze happens during/after display manager initialization.

Attempt 5: SSH Recovery Loop

Since SSH worked briefly, tried automated recovery:

while true; do 
  ssh wulfkaal@10.0.0.186 "sudo systemctl stop docker; sudo systemctl disable docker" && break
  sleep 1
done

Connected 3-4 times, but connection drops in <15 seconds before commands complete. Not enough time to disable Docker.
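
A variation on the loop above, sketched under assumptions the post does not confirm: that sudo on the Spark works without a password prompt in a non-interactive SSH session, and that the container is still named kimi-k2.5. It runs the persistent change first (systemctl disable), then clears the container's restart policy with docker update --restart=no. It also covers docker.socket, since socket activation can start the daemon again after docker.service has been stopped. The function names are illustrative.

```shell
# Build the remote command as one string so a single SSH session does all the work.
# Commands are joined with ";" so a failure in one does not skip the rest.
recovery_cmd() {
  # $1: container name (assumed from the docker run command in the post)
  printf '%s' "sudo systemctl disable docker.service docker.socket; "
  printf '%s' "sudo docker update --restart=no $1; "
  printf '%s\n' "sudo systemctl stop docker.socket docker.service"
}

# Retry until one attempt gets through. ssh exits with the status of the
# last remote command, so the loop ends once the final "stop" succeeds.
try_recovery() {
  # $1: user@host, $2: container name
  until ssh -o ConnectTimeout=3 "$1" "$(recovery_cmd "$2")"; do
    sleep 1
  done
}
```

It would be invoked as `try_recovery wulfkaal@10.0.0.186 kimi-k2.5`. Even if the session drops partway through, the disable step has the best chance of landing because it runs first and only changes unit symlinks.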


Current Status: Completely Bricked

What I have:

  • ✅ Expensive green paperweight

  • ✅ System that boots to a frozen login screen

  • ✅ Keyboard that works for exactly 10 seconds

  • ✅ 12+ hours of wasted research time

  • ✅ Zero productive work accomplished

What I don’t have:

  • ❌ Access to UEFI when I need it

  • ❌ USB boot capability

  • ❌ TTY console access

  • ❌ Recovery options that work

  • ❌ A functioning AI workstation


Technical Analysis

Issue 1: ACPI Resource Failures

The boot logs show systematic ACPI resource allocation failures for NVIDIA devices. These errors appear every boot:

platform NVDA8800:00: failed to claim resource 0
acpi NVDA8800:00: platform device creation failed: -16

This suggests firmware-level issues with device initialization.

Issue 2: USB Controller Failure

USB devices are completely invisible to UEFI boot options. I’ve created verified bootable USBs that work on other systems, but the DGX cannot see them at all. This matches the pattern I found searching these forums: “DGX Spark is Inoperable: Failed USB Controller/Firmware”

Issue 3: GRUB/Bootloader Instability

GRUB access is inconsistent (roughly 1 in 20 attempts), and when accessible, boot parameter modifications are ignored. This isn’t normal behavior for Ubuntu systems.

Issue 4: Post-Login Freeze

System freezes specifically after authentication, suggesting a GPU driver or display manager initialization issue. The timing is consistent: password works → desktop starts loading → complete freeze.


Questions for the Community

Has anyone else experienced this?

Specifically:

  1. USB boot failures - UEFI not detecting bootable USB drives?

  2. UEFI access inconsistency - F2/Del/Esc only working occasionally?

  3. Post-login freeze - System locks up after password entry?

  4. ACPI resource errors - Platform device creation failures on boot?

  5. Docker container crashes - vLLM or other GPU containers causing system freeze?

Questions:

  • Is there a known firmware version that fixes USB detection?

  • Is there a special key combination for UEFI I’m missing?

  • Has anyone successfully recovered from a similar state?

  • Should USB boot just… work? Or is there a setting I’m missing?

  • Is this a known issue with DGX Spark? (I’m seeing hints in old forum posts)


What I Need

Immediate:

  • Way to disable Docker service without GUI/SSH access

  • Reliable method to access UEFI/BIOS

  • Solution for USB boot detection failure

  • Or… any other recovery method I haven’t tried

Long-term:

  • Firmware update addressing these issues?

  • Confirmation this is a known problem?

  • Timeline for fixes?


My Setup

  • Model: DGX Spark

  • GPU: NVIDIA GB10 (Blackwell)

  • OS: DGX OS (Ubuntu 24.04 based) - out of box, minimal changes

  • Purchase: January 2026

  • Use case: Academic AI research (LLM inference, blockchain nodes)

  • IP: 10.0.0.186 (for reference)


Why This Matters

I’m not just annoyed - I’m genuinely concerned:

  1. Research Impact: This was purchased for time-sensitive academic research. Every day it’s down is a day of lost productivity.

  2. Recovery Design Flaw: A professional AI workstation should have reliable recovery options. USB boot is the standard recovery method for Linux systems. If that doesn’t work, what’s the fallback?

  3. Update Stability: A system marketed for AI development should handle standard apt upgrade without catastrophic failure.

  4. Community Pattern: Searching these forums, I see similar reports (USB boot issues, UEFI access problems, post-update instability). Are these isolated incidents or a systemic issue?


What I’ve Learned

For others considering DGX Spark:

Don’t run apt upgrade without backups
Don’t use Docker’s --restart always flag
Don’t expect USB recovery to work
Don’t expect UEFI access to be reliable
Don’t expect standard Linux recovery procedures to work

Do have an enterprise NVIDIA support contract
Do have someone who can physically access the system
Do consider alternatives (Lambda Vector, HP/Dell workstations, custom builds)


Request to NVIDIA

If NVIDIA support is reading this:

I need help today. I cannot wait for firmware updates or lengthy support processes. This system is:

  • Defective out of box (USB boot failure)

  • Unstable (crashes from standard operations)

  • Unrecoverable (all standard recovery methods fail)

  • Unfit for purpose (cannot run AI workloads)

I need either:

  1. Immediate remote assistance to recover the system

  2. RMA/replacement unit with verified firmware

  3. Full refund so I can purchase reliable hardware

I’m an academic researcher, not an enterprise customer with an IT department. I need a workstation that works.


Call to Community

If you’ve experienced similar issues:

  • Please reply with your experience

  • Share any workarounds you’ve found

  • Confirm if this is a known issue

  • Help me validate this isn’t just “user error”

If you work at NVIDIA:

  • Please escalate this

  • Please confirm if USB boot is a known issue

  • Please provide timeline for firmware fixes

  • Please help me recover this system

If you’re considering DGX Spark:

  • Read this carefully

  • Search forums for similar issues

  • Consider alternatives

  • Have a backup plan


Update History

January 30, 2026 (13:00 CST): Initial post
January 30, 2026 (18:00 CST): Attempted USB recovery - failed
January 31, 2026: [will update with any progress]


I want to love this system. The hardware is beautiful. But right now, it’s unusable.

Has anyone else had these issues? Am I the only one? Please help.


System Info for NVIDIA Support:

  • Hostname: spark-76b8

  • IP: 10.0.0.186

  • Purchase: January 24, 2026

  • Order #155970

  • Logs available on request

#DGXSpark #NVIDIASupport #RecoveryHelp #USBBootFailure #UEFIIssues

Hi, sorry to see you are running into issues. First off, which brand of Spark do you have? Is it the Founders Edition or an OEM variant? Next, can you explain what you mean when you say you tried both front and rear USB ports? The Spark has USB ports on only one side, and they are all USB-C.
Also, any screenshots you can get of the UEFI would be very helpful so we can better understand why you are not seeing some of the information that is there.
I would recommend you read the User Guide on how to reimage your system via USB with the DGX Spark OS: System Recovery — DGX Spark User Guide

Founders Edition Spark.
I cannot get into the UEFI despite everything I tried over 15 hours. I read everything in the guide and nothing has worked: no combination of keys, no USB, nothing.

Sounds like your keyboard/mouse/usb peripherals are not recognized by Spark during boot. Do you have a spare keyboard/mouse/usb available to try?

Indeed, I tried 3 different keyboards and 3 different mice AND I bought a new keyboard (number 4) that is fully Linux compatible.
Not one of the 4 (I repeat, 4) different keyboards worked beyond the password prompt. It always cuts out in the next setup phase.

If no peripherals work, then triage will be limited. I read that you purchased the unit less than a week ago; I recommend you contact your point of purchase and ask for a refund.

If you cannot do that, contact my colleagues at NVIDIA DGX Spark Support, reference this thread, and we can help you with an RMA. Note that we cannot help you with a refund. That is with your seller.

I received the DGX this Wednesday (2 days ago). I would have been grinding it out all the way if the peripherals had worked. I appreciate any support for a refund, which I have already requested.

You seem like a Linux expert with all of the Linux customization you’re making! Super impressive.

I might recommend leaving the OS a little more ‘stock’ and introducing your changes more slowly.

The crash you experienced while running Kimi is the typical OOM experience on the Spark.

It’s a one-trillion-parameter model with something north of 600 GB of model weights alone. Were you hoping it would fit into the 128 GB of shared memory on the Spark? I haven’t looked into it, but maybe there’s something from the Unsloth boys by now (they did the 1.5-bit quant thing for another model earlier) - but the native weights would need very many Sparks clustered.
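
As a back-of-the-envelope check on the sizing argument above: weight memory is roughly the parameter count times bits per weight, divided by 8. The sketch below assumes the one-trillion figure quoted above and ignores KV cache, activations, and runtime overhead, so real requirements are higher. The function name `weights_gb` is illustrative.

```shell
# Rough weight-only memory estimate in decimal gigabytes.
weights_gb() {
  # $1: parameter count in billions, $2: bits per weight
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

weights_gb 1000 8     # 8-bit weights
weights_gb 1000 4     # 4-bit weights
weights_gb 1000 1.5   # 1.5-bit quantization
```

This prints 1000.0, 500.0, and 187.5. Even the 1.5-bit case is above the Spark’s 128 GB before any working memory is counted.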

Good luck and happy hacking!