DGX Spark - Need to resize root partition for headless DGX OS and/or other Linux distros

Well, well, well, have a look at that:

eugr@spark:~$ fastfetch
             .',;::::;,'.                 eugr@spark
         .';:cccccccccccc:;,.             ----------
      .;cccccccccccccccccccccc;.          OS: Fedora Linux 43 (KDE Plasma Desktop Edition) aarch64
    .:cccccccccccccccccccccccccc:.        Host: NVIDIA_DGX_Spark (A.7)
  .;ccccccccccccc;.:dddl:.;ccccccc;.      Kernel: Linux 6.17.1-300.fc43.aarch64
 .:ccccccccccccc;OWMKOOXMWd;ccccccc:.     Uptime: 22 mins
.:ccccccccccccc;KMMc;cc;xMMc;ccccccc:.    Packages: 2421 (rpm)
,cccccccccccccc;MMM.;cc;;WW:;cccccccc,    Shell: bash 5.3.0
:cccccccccccccc;MMM.;cccccccccccccccc:    Display (Unknown-1): 800x600 @ 60 Hz in 10"
:ccccccc;oxOOOo;MMM000k.;cccccccccccc:    DE: KDE Plasma 6.4.5
cccccc;0MMKxdd:;MMMkddc.;cccccccccccc;    WM: KWin (Wayland)
ccccc;XMO';cccc;MMM.;cccccccccccccccc'    WM Theme: Breeze
ccccc;MMo;ccccc;MMW.;ccccccccccccccc;     Theme: Breeze (Light) [Qt], Breeze [GTK2/3]
ccccc;0MNc.ccc.xMMd;ccccccccccccccc;      Icons: Breeze [Qt], breeze [GTK2/3/4]
cccccc;dNMWXXXWM0:;cccccccccccccc:,       Font: Noto Sans (10pt) [Qt], Noto Sans (10pt) [GTK2/3/4]
cccccccc;.:odl:.;cccccccccccccc:,.        Cursor: Breeze (24px)
ccccccccccccccccccccccccccccc:'.          Terminal: /dev/pts/4
:ccccccccccccccccccccccc:;,..             CPU: Cortex-A725*5 + Cortex-X925*5 + Cortex-A725*5 + Cortex-X925*5 (20) @ 3.90 GHz
 ':cccccccccccccccc::;,.                  GPU: NVIDIA Device 2E12 (VGA compatible)
                                          Memory: 4.37 GiB / 119.69 GiB (4%)
                                          Swap: 0 B / 8.00 GiB (0%)
                                          Disk (/): 20.17 GiB / 538.30 GiB (4%) - btrfs
                                          Local IP (enP7s7): 192.168.24.104/24
                                          Locale: en_US.UTF-8

                                                                  
eugr@spark:~$ nvidia-smi
Thu Oct 23 13:07:12 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GB10                    Off |   0000000F:01:00.0 Off |                  N/A |
| N/A   38C    P8              3W /  N/A  | Not Supported          |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

So, it looks like it’s halfway there, but the GUI refuses to go above 800x600 resolution.
If you are not going to use it as a desktop, though, that doesn’t matter.

The most important test is this:

eugr@spark:~/llama.cpp$ build/bin/llama-cli --list-devices
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GB10, compute capability 12.1, VMM: yes
Available devices:
  CUDA0: NVIDIA GB10 (122558 MiB, 117541 MiB free)

Token generation is slower than on stock DGX OS, but model loading time improved significantly:

eugr@spark:~/llama.cpp$ build/bin/llama-bench -m /run/media/eugr/root/home/eugr/.cache/llama.cpp/ggml-org_gpt-oss-120b-GGUF_gpt-oss-120b-mxfp4-00001-of-00003.gguf -fa 1 -d 0,4096,8192,16384,32768 -p 2048 -n 32 -ub 2048 -mmp 0
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GB10, compute capability 12.1, VMM: yes

| model                  |      size |   params | backend | ngl | n_ubatch | fa | mmap |            test |            t/s |
| ---------------------- | --------: | -------: | ------- | --: | -------: | -: | ---: | --------------: | -------------: |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |          pp2048 | 1864.44 ± 3.08 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |            tg32 |   41.79 ± 0.13 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |  pp2048 @ d4096 | 1730.84 ± 4.07 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |    tg32 @ d4096 |   37.90 ± 0.04 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |  pp2048 @ d8192 | 1628.49 ± 7.19 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |    tg32 @ d8192 |   36.38 ± 0.10 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 | pp2048 @ d16384 | 1395.37 ± 8.78 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | CUDA    |  99 |     2048 |  1 |    0 |   tg32 @ d16384 |   34.23 ± 0.01 |
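
To see how quickly generation slows down as the context fills up, the tg32 numbers from this run can be expressed as a fraction of the zero-depth rate. This is only a rough sketch: the values are copied by hand from the results above, and the names `TG32` and `relative_throughput` are made up for illustration, not part of llama.cpp.

```python
# tg32 mean tokens/s by context depth, copied by hand from the
# llama-bench results above (Fedora 43 run, gpt-oss 120B MXFP4).
TG32 = {0: 41.79, 4096: 37.90, 8192: 36.38, 16384: 34.23}


def relative_throughput(results):
    """Return each depth's t/s as a fraction of the depth-0 t/s."""
    baseline = results[0]
    return {depth: tps / baseline for depth, tps in results.items()}


if __name__ == "__main__":
    for depth, ratio in sorted(relative_throughput(TG32).items()):
        print(f"d{depth:<6} {TG32[depth]:6.2f} t/s  {ratio:6.1%} of d0")
```

The same helper can be pointed at the stock DGX OS tg32 numbers to compare the two systems depth by depth, once both runs use identical llama-bench flags.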