DGX Spark - Need to resize root partition for headless DGX OS and/or other Linux distros

I need to split the 4TB root partition into four 1TB partitions so I can install a second, headless DGX OS without the desktop UI and try out other distros. So far the recovery media does not provide a way to start in a "try" (live) mode like other distros do, so I cannot resize from it. Booting from a USB drive with an Ubuntu ARM server or desktop image fails.

Q1. Is there a bootable image of DGX OS I can use?
Q2. Is there any other distro that works?
Q3. Will you have a DGX Server (headless) image soon?

@Neurfer if you want a headless setup just start your DGX Spark in multi-user mode. Run systemctl set-default multi-user.target and reboot.

For installing other distros don’t forget that you’ll have to hack the device trees (DTS) to boot.

Precisely why I need more than one partition. One for standard, another headless one to remove all GUI junk keeping it lean and mean, and others for other distros, and build and test my own sweet J.A.R.V.I.S. OS :)

Sorry, I misunderstood your question. You want multiple installs of DGX OS. I thought you were referring to other Linux distributions like RHEL, ArchLinux, etc.

You can boot from a USB drive and use resize2fs and a partition editor to shrink the DGX OS partition, but it’s not really a supported configuration. If you want to mess with that kind of thing, definitely back up your data first and it wouldn’t hurt to have the DGX OS recovery image handy in case something goes wrong.

If you just want to have a boot entry that doesn’t launch the GUI, you should be able to add a GRUB menu entry that adds systemd.unit=multi-user.target or whatever other target unit you want. That would be easier and safer than messing with the partitions.
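
A minimal sketch of what such an entry could look like, written as a small shell helper that prints a menuentry stanza suitable for /etc/grub.d/40_custom. The function name, the UUID, and the kernel version below are placeholders, and the /boot/vmlinuz-... and /boot/initrd.img-... paths assume an Ubuntu-style layout with /boot on the root filesystem; compare them against the existing entries in your own /boot/grub/grub.cfg before using it.

```shell
#!/bin/sh
# Print a GRUB menuentry that boots the existing install into text mode.
# Hypothetical helper: both arguments are placeholders you must fill in.
#   $1 = UUID of the root filesystem (see: findmnt -no UUID /)
#   $2 = kernel version string       (see: uname -r)
print_headless_entry() {
    root_uuid="$1"
    kver="$2"
    cat <<EOF
menuentry 'DGX OS (headless, multi-user.target)' {
    search --no-floppy --fs-uuid --set=root ${root_uuid}
    linux /boot/vmlinuz-${kver} root=UUID=${root_uuid} ro systemd.unit=multi-user.target
    initrd /boot/initrd.img-${kver}
}
EOF
}

# Example with made-up values; this only prints text and changes nothing.
print_headless_entry "00000000-0000-0000-0000-000000000000" "6.0.0-example"
```

Appending the printed stanza to /etc/grub.d/40_custom and running sudo update-grub should make the extra entry show up in the menu, while the default entry keeps booting the desktop.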

One of the other forum posts mentioned a solution someone tried. You can review this and see if it might be applicable to your needs.

Note: This is not an officially provided solution by NVIDIA but something a user contributed.

Setting up your DGX SPARK for Remote Virtual Desktop (with Headless Sunshine Setup) - DGX Spark / GB10 User Forum / DGX Spark / GB10 Projects - NVIDIA Developer Forums

Haha, that’s exactly what I am trying to do. But NVIDIA is not providing a bootable image to “boot from a USB drive”. The recovery media literally does one thing: reimage. All other distros’ images failed to boot. I even tried Ventoy and Penguins’ Egg to build my own bootable image, with no success so far. And no, I am not worried about losing data. I would not put important data on a new pilot device for which I don’t have a solid, proven recovery procedure. That is also why I tried to delete the ridiculous solitaire game that was included, and found out that doing so destroys the entire OS: DGX Spark - don’t remove games? - #7 by Neurfer

Sure, I can spend weeks and weeks figuring it out on my own, but I don’t have NVIDIA’s resources, and I am sure DGX Spark developers are NOT reimaging their test devices every time a test messes one up. So there must be a bootable image circulating internally. All I am really asking is to share it with the community members who are willing to experiment and contribute. We all understand that, with a new (pilot) product, this is not something you can or should have been doing all along :)

Also whatever that Sunshine guy is doing isn’t it.

Goal is simple: We Linux engineers have been doing this forever. That’s why grub exists.

Goal: Install multiple OS images, each in its own partition, so the user can choose which one to start at boot time via the GRUB menu.

What I did was mount a tmpfs at /run/nextroot, extract an Arch Linux ARM image to it, and then use systemctl soft-reboot. At that point the rootfs is unmounted and you can resize the partition and install the new system from there. I haven’t tried constructing an Arch Linux ARM boot flash drive image that includes the DGX OS kernel but it ought to be possible.

This is way off into power user territory for this product so just be aware that you’re forging the path here.
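
The tmpfs plus soft-reboot approach described above could be sketched roughly as follows. This is not the exact procedure that was used: the tarball path and the 8G tmpfs size are assumptions, and the script only prints the commands unless DRY_RUN=0 is set (in which case it must run as root). It also assumes a systemd version that provides soft-reboot.

```shell
#!/bin/sh
# Sketch of the tmpfs + soft-reboot approach. Prints the commands by default;
# set DRY_RUN=0 (as root) to actually run them.
# Assumptions: TARBALL is a hypothetical path to an Arch Linux ARM rootfs
# tarball, and 8G is an arbitrary tmpfs size.
DRY_RUN="${DRY_RUN:-1}"
TARBALL="${TARBALL:-/path/to/ArchLinuxARM-aarch64-latest.tar.gz}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_nextroot() {
    # RAM-backed filesystem that will become the temporary root
    run mkdir -p /run/nextroot
    run mount -t tmpfs -o size=8G,mode=0755 tmpfs /run/nextroot
    # Unpack the root filesystem, preserving permissions
    run tar -xpf "$TARBALL" -C /run/nextroot
}

soft_reboot() {
    # systemd restarts userspace from /run/nextroot; the kernel keeps running
    run systemctl soft-reboot
}

prepare_nextroot
soft_reboot
```

Because the running kernel is kept, the temporary root still runs on the DGX OS kernel. Whatever tools you plan to use afterwards (a partition editor, resize2fs) have to be present inside the extracted root before the soft-reboot.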

Excellent suggestion! I can try that. That also means I may even be able to use Arch Linux’s live image. But once I create multiple partitions, I still need a working bootable image that allows me to choose which partition to install another DGX OS on.

Dell has two ISO images (US + WW) for their Dell Pro Max GB10 boxes.

You could probably void your warranty just by looking at it, but it might be a good starting point.
Firmware (packages) will most likely be different. I just had a look into it. It seems that Dell is more experienced with this kind of desktop machine. 😉

ASUS (the box I’m still waiting for) seems also to offer an ISO - Wow. Last time I checked there was only a Safety PDF.

Arch Linux doesn’t support ARM and I don’t think Arch Linux ARM has bootable disk images with EFI boot support.

If you want to use the DGX Spark recovery image, my coworker pointed me at the file named sgdisk.txt.example that gets written to the thumbdrive. If you rename that to sgdisk.txt, it will use that for partitioning instead of the default of using the whole NVMe drive. So rather than shrinking an existing DGX OS partition, if you’re okay with restoring the whole device you can use that to adjust the partition layout. Just be aware that the recovery image will wipe the entire drive so back up anything you want to keep first.

That is a great idea. These machines are all supposed to be the same (the specs look like it), right? Same CPU, same GPU, same chipset, same motherboard, everything but the case. But yeah, I don’t want to void my warranty, unless someone from NVIDIA can say for sure whether Dell’s image will work?

On the sgdisk.txt.example idea: looking at the file, it looks like I might be able to do the following.

1st image on USB #1:
sgdisk -Z -n 1:2048:612351 -t 1:ef00 -n 2:612352:+1T /dev/nvme0n1
This will wipe the current layout like the default recovery does, but will create only one EFI partition and one 1TB partition, leaving about 3TB unused.

2nd image on USB #2:
sgdisk -n 3::+ -t 3:ef00 -n 4::+1T /dev/nvme0n1
Without -Z, it won’t wipe the disk and will just create partitions 3 and 4, but will it install onto these partitions or overwrite the previous ones?
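
As a sanity check on the first command, the sector range given for partition 1 can be converted into a size. This sketch assumes 512-byte logical sectors, which is not confirmed for this drive (blockdev --getss /dev/nvme0n1 reports the actual value); part_size_mib is a made-up helper name.

```shell
#!/bin/sh
# Size in MiB of a partition spanning sectors [first, last] inclusive.
#   $1 = first sector, $2 = last sector, $3 = logical sector size in bytes
part_size_mib() {
    echo $(( ($2 - $1 + 1) * $3 / 1024 / 1024 ))
}

# Partition 1 from the first sgdisk command: -n 1:2048:612351
part_size_mib 2048 612351 512    # prints 298
```

Under that assumption, partition 1 is a 298 MiB EFI system partition, and partition 2 starts at the very next sector (612352) and extends for 1 TiB.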

Thanks @aplattner, sgdisk.txt was the way into the solution. I now have 3+ partitions: Main Desktop (1TB), Headless Server (1TB), and Golden Image (40GB).

I created a draft script that creates an additional test instance from the golden image in under 10 minutes each, with a new UUID, a new hostname, and a desktop background showing its hostname. The new instance shows up in the GRUB menu at boot. It took 12+ reimagings, shuffling partitions and clones around, and several sleepless nights, but it’s working flawlessly now.

The script can be destructive and dangerous, and is not fully tested, so I will share it once I get some time to test it more and document it.

While I would love to make my next step generating a bootable live golden ISO image, usable for live boot, installation, or creating additional DGX OS partitions, I would rather focus on my AI/ML experiments. And I am sure the brilliant engineers of NVIDIA will come up with a better solution, like providing a standard live ISO image, in the next few days anyway.

@cosinus, FYI, Dell’s image booted better than the other distros, but wasn’t good enough in the end. At a glance, it seems to have a problem with the sound driver, but I didn’t bother to test further and didn’t want to risk it, and my solution works great now :)

Here’s my Grub screenshot. os-prober detected the additional DGX OSs as Ubuntu 24.04.

FWIW, I was able to boot up Fedora 43 beta from live USB. It works only in “basic graphics mode” which looks like 800x600 and complains about chipset, but at least it boots. I think I’ll just install it on an external SSD and see what can be done about it.

Well, well, well, have a look at that:

eugr@spark:~$ fastfetch
             .',;::::;,'.                 eugr@spark
         .';:cccccccccccc:;,.             ----------
      .;cccccccccccccccccccccc;.          OS: Fedora Linux 43 (KDE Plasma Desktop Edition) aarch64
    .:cccccccccccccccccccccccccc:.        Host: NVIDIA_DGX_Spark (A.7)
  .;ccccccccccccc;.:dddl:.;ccccccc;.      Kernel: Linux 6.17.1-300.fc43.aarch64
 .:ccccccccccccc;OWMKOOXMWd;ccccccc:.     Uptime: 22 mins
.:ccccccccccccc;KMMc;cc;xMMc;ccccccc:.    Packages: 2421 (rpm)
,cccccccccccccc;MMM.;cc;;WW:;cccccccc,    Shell: bash 5.3.0
:cccccccccccccc;MMM.;cccccccccccccccc:    Display (Unknown-1): 800x600 @ 60 Hz in 10"
:ccccccc;oxOOOo;MMM000k.;cccccccccccc:    DE: KDE Plasma 6.4.5
cccccc;0MMKxdd:;MMMkddc.;cccccccccccc;    WM: KWin (Wayland)
ccccc;XMO';cccc;MMM.;cccccccccccccccc'    WM Theme: Breeze
ccccc;MMo;ccccc;MMW.;ccccccccccccccc;     Theme: Breeze (Light) [Qt], Breeze [GTK2/3]
ccccc;0MNc.ccc.xMMd;ccccccccccccccc;      Icons: Breeze [Qt], breeze [GTK2/3/4]
cccccc;dNMWXXXWM0:;cccccccccccccc:,       Font: Noto Sans (10pt) [Qt], Noto Sans (10pt) [GTK2/3/4]
cccccccc;.:odl:.;cccccccccccccc:,.        Cursor: Breeze (24px)
ccccccccccccccccccccccccccccc:'.          Terminal: /dev/pts/4
:ccccccccccccccccccccccc:;,..             CPU: Cortex-A725*5 + Cortex-X925*5 + Cortex-A725*5 + Cortex-X925*5 (20) @ 3.90 GHz
 ':cccccccccccccccc::;,.                  GPU: NVIDIA Device 2E12 (VGA compatible)
                                          Memory: 4.37 GiB / 119.69 GiB (4%)
                                          Swap: 0 B / 8.00 GiB (0%)
                                          Disk (/): 20.17 GiB / 538.30 GiB (4%) - btrfs
                                          Local IP (enP7s7): 192.168.24.104/24
                                          Locale: en_US.UTF-8

                                                                  
eugr@spark:~$ nvidia-smi
Thu Oct 23 13:07:12 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GB10                    Off |   0000000F:01:00.0 Off |                  N/A |
| N/A   38C    P8              3W /  N/A  | Not Supported          |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

So, looks like it’s halfway there, but GUI doesn’t want to go above 800x600 resolution.
But if you are not going to use it as a desktop, doesn’t matter.

The most important test is this:

eugr@spark:~/llama.cpp$ build/bin/llama-cli --list-devices
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GB10, compute capability 12.1, VMM: yes
Available devices:
  CUDA0: NVIDIA GB10 (122558 MiB, 117541 MiB free)

Getting worse token generation performance than on stock DGX OS, but model loading time improved significantly:

eugr@spark:~/llama.cpp$ build/bin/llama-bench -m /run/media/eugr/root/home/eugr/.cache/llama.cpp/ggml-org_gpt-oss-120b-GGUF_gpt-oss-120b-mxfp4-00001-of-00003.gguf -fa 1 -d 0,4096,8192,16384,32768 -p 2048 -n 32 -ub 2048 -mmp 0
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GB10, compute capability 12.1, VMM: yes

| model                  |      size |   params | backend | ngl | n_ubatch | fa | mmap |            test |            t/s |
| ---------------------- | --------: | -------: | ------- | --: | -------: | -: | ---: | --------------: | -------------: |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |          pp2048 | 1864.44 ± 3.08 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |            tg32 |   41.79 ± 0.13 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |  pp2048 @ d4096 | 1730.84 ± 4.07 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |    tg32 @ d4096 |   37.90 ± 0.04 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |  pp2048 @ d8192 | 1628.49 ± 7.19 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |    tg32 @ d8192 |   36.38 ± 0.10 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 | pp2048 @ d16384 | 1395.37 ± 8.78 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |   tg32 @ d16384 |   34.23 ± 0.01 |

For comparison, this is what I’m getting with DGX OS. However, model loading on DGX OS is significantly slower. We are talking 56 seconds on DGX OS vs. 19 seconds on Fedora 43. Both times the model was loaded from DGX OS SSD (not the one I used for Fedora). But overall performance is higher on DGX OS - I wonder what optimizations are not available in the mainline CUDA/kernel yet.

| model                  |      size |   params | backend | ngl | n_ubatch | fa | mmap |            test |            t/s |
| ---------------------- | --------: | -------: | ------- | --: | -------: | -: | ---: | --------------: | -------------: |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |          pp2048 | 1901.04 ± 7.00 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |            tg32 |   56.83 ± 0.38 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |  pp2048 @ d4096 | 1813.11 ± 8.25 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |    tg32 @ d4096 |   51.72 ± 0.16 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |  pp2048 @ d8192 | 1721.38 ± 7.84 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |    tg32 @ d8192 |   47.58 ± 0.36 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 | pp2048 @ d16384 | 1515.54 ± 4.80 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |   tg32 @ d16384 |   44.94 ± 0.67 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 | pp2048 @ d32768 | 1243.26 ± 2.29 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |   tg32 @ d32768 |   39.40 ± 0.81 |
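
To put the gap in numbers, here is a small sketch that expresses the Fedora 43 results as a percentage of the DGX OS results. The pct helper is a made-up name; the t/s values are copied from the two result sets above, and only the mean values are used (the ± spread is ignored).

```shell
#!/bin/sh
# Percentage of a baseline value: pct VALUE BASELINE
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

# Fedora 43 (first argument) vs. DGX OS (second argument), in t/s
echo "pp2048 at depth 0: $(pct 1864.44 1901.04)% of DGX OS"   # 98.1
echo "tg32 at depth 0:   $(pct 41.79 56.83)% of DGX OS"       # 73.5
echo "tg32 at d16384:    $(pct 34.23 44.94)% of DGX OS"       # 76.2
```

By this measure prompt processing is within about 2% of DGX OS, while token generation is roughly a quarter slower.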

@eugr how do you connect your display? I use the HDMI port and the monitor is detected okay. It’s a Dell 4k panel. Here’s what fastfetch reports:

Display (DELL S2725QS): 3840x2160 @ 1.5x in 27", 60 Hz [External]

What OS/kernel are you running? I connect my portable FHD display via HDMI. It works fine with DGX OS and with other machines, including Fedora 43 x86. But Fedora 43 aarch64 only works in VGA 800x600 mode, despite the NVIDIA open drivers being loaded.