DGX Spark: CUDA Install Pitfalls on Ubuntu 24.04 (ARM64) – FIXED

Issue

Installing the CUDA toolkit on a DGX Spark (Grace + Blackwell, ARM64/SBSA) running Ubuntu 24.04 either failed outright or produced a broken setup (no nvcc, compile errors, or the installer aborting). nvidia-smi showed the GPUs and driver 580.95.05 as healthy, but local CUDA builds weren’t possible.

On a DGX Spark (Grace + Blackwell, ARM64/SBSA) running Ubuntu 24.04, getting a working CUDA toolkit (with nvcc) turned into a trap maze. The GPUs were visible and healthy (nvidia-smi showed Driver 580.95.05 and CUDA 13.0 runtime), but the toolkit itself wasn’t present—no /usr/local/cuda-*, no compiler, nothing to build against. Multiple “official” paths either failed outright or produced an unusable toolchain:

  • The Ubuntu 24.04 ARM64 CUDA repo was reachable, but the usual meta-packages (cuda-toolkit, cuda-toolkit-13-0, cuda-13-0) simply weren’t published at that time, so apt couldn’t install anything meaningful.

  • The NVIDIA download UI often surfaced cross-SBSA (cross-compile) or RHEL RPM flows when selecting ARM64, which are not suitable for native Ubuntu installs; the deb (local) variant for Ubuntu 24.04 ARM64 returned 404.

  • Installing CUDA 12.2 “worked” in the sense that files landed under /usr/local, but compilation failed on Ubuntu 24.04 because GCC 13 / newer glibc headers expose types (e.g., SVE intrinsics) that CUDA 12.2’s toolchain can’t parse—even when forcing g++-12 or using -allow-unsupported-compiler.

  • The CUDA 13.x runfile includes a driver by default; if you miss unchecking it, the installer attempts a driver install and aborts, despite the system already having a newer, compatible driver via DGX/NVIDIA repos.

What this looked like in practice
You could see the GPU and a “CUDA Version: 13.0” in nvidia-smi, which implied CUDA was “there,” but that value reflects the driver’s runtime capability, not the presence of the toolkit. Without nvcc and the headers/libraries, any local CUDA build failed immediately. Attempts to “do it the right way” via Ubuntu packages or older toolkits led to dead ends (missing apt packages, 404 installers, or compiler/runtime mismatches).

The reliable, architecture-correct path turned out to be the universal ARM64 SBSA runfile for CUDA 13.0, installed toolkit-only (leaving Driver 580 in place) under /usr/local/cuda-13.0, followed by a clean environment setup. Once installed this way, deviceQuery, vectorAdd, and custom kernels built and ran as expected.
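A quick way to tell the two apart from a shell (a small sketch using the same paths this post assumes):

# Driver/runtime view: works even when no toolkit is installed
nvidia-smi | head -n 3
# Toolkit view: these only exist after a toolkit install
command -v nvcc || echo "nvcc not found"
ls -d /usr/local/cuda* 2>/dev/null || echo "no /usr/local/cuda-* directories"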

System

  • Platform: NVIDIA DGX Spark (Grace + Blackwell), ARM64/SBSA (aarch64)

  • OS: Ubuntu 24.04.3 LTS

  • Driver: 580.95.05

  • Goal: Working CUDA toolkit + compiler (nvcc) for native builds

Symptoms

  • nvidia-smi showed CUDA Version: 13.0 (driver runtime), but no nvcc.

  • /usr/local/ had no cuda-* dirs.

  • CUDA 12.2 installed but failed to compile on Ubuntu 24.04 (GCC/glibc headers errors).

  • Ubuntu 24.04 apt repo did not expose expected toolkit metas (cuda-toolkit, cuda-toolkit-13-0, cuda-13-0) at that time.

  • CUDA 13.0 runfile tried to install a driver if not explicitly unchecked, then bailed.

Root Cause(s)

  1. Download picker traps: the picker surfaces cross-SBSA (cross-compile) or RPM/RHEL packages instead of the native ARM64 Ubuntu runfile.

  2. CUDA 12.2 vs Ubuntu 24.04 toolchain: GCC 13 / newer glibc headers (SVE types) break 12.2 builds even with g++-12.

  3. Repo lag: Ubuntu 24.04 ARM64 CUDA repo reachable but toolkit metas not present.

  4. Installer defaults: runfile bundles a driver; must uncheck it on systems that already have the correct driver.

Dead Ends Encountered

  • Ubuntu 24.04 ARM64 “deb (local)”: 404 (not published).

  • apt-get install cuda-toolkit*: packages not available for this channel.

  • CUDA 12.2 on Ubuntu 24.04: compile errors (_Float64, __Float32x4_t…) even with -allow-unsupported-compiler + -ccbin g++-12 (an illustrative invocation follows this list).
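For reference, the kind of invocation that still failed with 12.2 (illustrative only; test.cu is a placeholder source file):

# Illustrative: CUDA 12.2 nvcc on Ubuntu 24.04 trips over newer glibc/GCC host headers
/usr/local/cuda-12.2/bin/nvcc -ccbin g++-12 -allow-unsupported-compiler test.cu -o test
# fails with errors on host header types such as _Float64 and __Float32x4_t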

Final Solution (Works)

Install CUDA 13.0.2 toolkit via the universal ARM64 SBSA runfile, toolkit only, keep the existing 580.95.05 driver.

CUDA Toolkit Archive with Latest Release

https://developer.nvidia.com/cuda-toolkit-archive

Steps

1) Download the runfile (from the CUDA 13.0 archive/picker; a scripted download sketch follows the bullets):

  • Platform: Linux → arm64-sbsa → Native → Ubuntu 24.04 → runfile (local)

  • Example file: cuda_13.0.2_580.95.05_linux_sbsa.run

  • Save to: ~/installers/cuda-13.0/
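If you prefer scripting the download, the runfile can be fetched directly. The URL below assumes NVIDIA’s usual local-installer layout for the file named above, so confirm it against the picker before relying on it:

mkdir -p ~/installers/cuda-13.0 && cd ~/installers/cuda-13.0
# URL assumed from the standard local-installer pattern; verify on the download page
wget https://developer.download.nvidia.com/compute/cuda/13.0.2/local_installers/cuda_13.0.2_580.95.05_linux_sbsa.run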

2) Install (interactive; UNCHECK driver)

cd ~/installers/cuda-13.0
chmod +x cuda_13.0.2_580.95.05_linux_sbsa.run
sudo sh ./cuda_13.0.2_580.95.05_linux_sbsa.run
# In the TUI:
#  - Accept EULA
#  - UNCHECK "Driver"
#  - KEEP "CUDA Toolkit" checked
#  - Install path: /usr/local/cuda-13.0
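If you’d rather skip the TUI, the runfile also accepts non-interactive flags; a toolkit-only sketch equivalent to the choices above:

# Non-interactive alternative: toolkit only, no driver, explicit install path
sudo sh ./cuda_13.0.2_580.95.05_linux_sbsa.run --silent --toolkit --toolkitpath=/usr/local/cuda-13.0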

3) Environment (system-wide)

sudo tee /etc/profile.d/cuda.sh >/dev/null <<'EOF'
export PATH=/usr/local/cuda-13.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-13.0/lib64:${LD_LIBRARY_PATH}
EOF
sudo ln -sfn /usr/local/cuda-13.0 /usr/local/cuda
source /etc/profile.d/cuda.sh

4) Verify

which nvcc && nvcc --version
nvidia-smi

Expected: nvcc shows 13.0 (V13.0.88); nvidia-smi shows Driver 580.95.05, CUDA Version 13.0.
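Before pulling the full samples tree, a one-file smoke test confirms the compiler and driver agree (a minimal sketch; -arch=native assumes an nvcc recent enough to support it, which 13.0 is):

cat > /tmp/hello.cu <<'EOF'
#include <cstdio>
__global__ void hello() { printf("Hello from GPU thread %d\n", threadIdx.x); }
int main() { hello<<<1, 4>>>(); cudaDeviceSynchronize(); return 0; }
EOF
nvcc -arch=native /tmp/hello.cu -o /tmp/hello && /tmp/hello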

5) Samples (CMake route)

sudo apt-get -y install git cmake build-essential
cd ~
git clone --depth=1 -b v13.0 https://github.com/NVIDIA/cuda-samples.git
cmake -S cuda-samples -B cuda-samples/build \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_CUDA_COMPILER=/usr/local/cuda-13.0/bin/nvcc
cmake --build cuda-samples/build --target deviceQuery -j"$(nproc)"
~/cuda-samples/build/Samples/1_Utilities/deviceQuery/deviceQuery | head -n 40

You should see device info; other samples like vectorAdd report Test PASSED.
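The vectorAdd check can be built the same way (the target name and output path are assumed to follow the samples’ CMake layout):

cmake --build cuda-samples/build --target vectorAdd -j"$(nproc)"
~/cuda-samples/build/Samples/0_Introduction/vectorAdd/vectorAdd   # expect "Test PASSED"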

Cleanup (do this to avoid future conflicts)

Remove old toolkits and stale loader entries, and strip user env overrides.

# Ensure loader points only to 13.0
echo "/usr/local/cuda-13.0/targets/sbsa-linux/lib" | sudo tee /etc/ld.so.conf.d/cuda-13-0.conf
sudo rm -f /etc/ld.so.conf.d/cuda-12-2.conf
sudo ldconfig

# Remove older toolkits (example)
sudo rm -rf /usr/local/cuda-12.2

# Comment out any 12.x PATH/LD_LIBRARY_PATH in ~/.bashrc
sed -i.bak -e '/CUDA Toolkit 12\./,+2 s/^/#/' ~/.bashrc
# New login (or source) to pick up /etc/profile.d/cuda.sh
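To confirm nothing stale survived the cleanup, a quick sweep of the usual locations (sketch):

# Look for leftover 12.x references in user and system env/loader files
grep -n "cuda-12" ~/.bashrc /etc/profile.d/*.sh /etc/ld.so.conf.d/*.conf 2>/dev/null
ldconfig -p | grep -i cuda | head   # loader entries should resolve to 13.0 paths only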

Notes & Gotchas

  • nvidia-smi “CUDA Version” = driver runtime capability, not the installed toolkit.

  • On ARM64, the linux_sbsa runfile is distro-agnostic; if Ubuntu’s tab doesn’t show it, pick RHEL to reveal the same runfile.

  • CUDA 12.x + Ubuntu 24.04 often requires a container (22.04 base) due to GCC/glibc changes. If you must stay on 12.x, build in a container (a run sketch follows this list).
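A minimal container sketch for the 12.x-on-24.04 case, assuming Docker plus the NVIDIA Container Toolkit are already set up; the image tag is an example, so check the published CUDA tags:

# Build inside a 22.04-based CUDA 12.x devel container (tag is illustrative)
docker run --rm -it --gpus all -v "$PWD":/work -w /work \
  nvidia/cuda:12.2.2-devel-ubuntu22.04 bash
# then run nvcc --version and your build inside /work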

Minimal Command Summary

# Install CUDA 13.0 toolkit (ARM64/SBSA), toolkit-only
cd ~/installers/cuda-13.0
chmod +x cuda_13.0.2_580.95.05_linux_sbsa.run
sudo sh ./cuda_13.0.2_580.95.05_linux_sbsa.run   # UNCHECK driver; path=/usr/local/cuda-13.0

# Env
sudo tee /etc/profile.d/cuda.sh >/dev/null <<'EOF'
export PATH=/usr/local/cuda-13.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-13.0/lib64:${LD_LIBRARY_PATH}
EOF
sudo ln -sfn /usr/local/cuda-13.0 /usr/local/cuda
source /etc/profile.d/cuda.sh

# Verify
which nvcc && nvcc --version
nvidia-smi

# Samples
git clone --depth=1 -b v13.0 https://github.com/NVIDIA/cuda-samples.git
cmake -S cuda-samples -B cuda-samples/build -DCMAKE_CUDA_COMPILER=/usr/local/cuda-13.0/bin/nvcc
cmake --build cuda-samples/build --target deviceQuery -j"$(nproc)"
~/cuda-samples/build/Samples/1_Utilities/deviceQuery/deviceQuery | head -n 40

# Cleanup of old toolkits
sudo rm -rf /usr/local/cuda-12.2
sudo rm -f /etc/ld.so.conf.d/cuda-12-2.conf
sudo ldconfig

Housekeeping

  • Exact artifact name: cuda_13.0.2_580.95.05_linux_sbsa.run

  • Where you stored it: ~/installers/cuda-13.0/

  • Driver present: 580.95.05 (kept)

  • OS / arch: Ubuntu 24.04.3 LTS, aarch64 (ARM64/SBSA)

  • Install path: /usr/local/cuda-13.0

  • Symlink: /usr/local/cuda -> /usr/local/cuda-13.0

  • Env file: /etc/profile.d/cuda.sh (PATH + LD_LIBRARY_PATH)

  • Cleanup performed: removed /usr/local/cuda-12.2, dropped /etc/ld.so.conf.d/cuda-12-2.conf, commented user-level 12.4 exports in ~/.bashrc.

Integrity Checks

After download:

sha256sum ~/installers/cuda-13.0/cuda_13.0.2_580.95.05_linux_sbsa.run
# record the hash in the post

One-shot “apply the fix” snippet (forum-friendly)

# 1) Run installer (toolkit-only)
cd ~/installers/cuda-13.0
chmod +x cuda_13.0.2_580.95.05_linux_sbsa.run
sudo sh ./cuda_13.0.2_580.95.05_linux_sbsa.run   # UNCHECK "Driver", path=/usr/local/cuda-13.0

# 2) Env + symlink
sudo tee /etc/profile.d/cuda.sh >/dev/null <<'EOF'
export PATH=/usr/local/cuda-13.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-13.0/lib64:${LD_LIBRARY_PATH}
EOF
sudo ln -sfn /usr/local/cuda-13.0 /usr/local/cuda
source /etc/profile.d/cuda.sh

# 3) Clean old entries (if present)
sudo rm -rf /usr/local/cuda-12.2
sudo rm -f /etc/ld.so.conf.d/cuda-12-2.conf
sudo ldconfig
sed -i.bak -e '/CUDA Toolkit 12\./,+2 s/^/#/' ~/.bashrc

Quick verification block

which nvcc && nvcc --version      # should show 13.0
nvidia-smi                        # should show driver 580.95.05, CUDA Version: 13.0

Optional: sample build line to prove it works

git clone --depth=1 -b v13.0 https://github.com/NVIDIA/cuda-samples.git
cmake -S cuda-samples -B cuda-samples/build -DCMAKE_CUDA_COMPILER=/usr/local/cuda-13.0/bin/nvcc
cmake --build cuda-samples/build --target deviceQuery -j"$(nproc)"
~/cuda-samples/build/Samples/1_Utilities/deviceQuery/deviceQuery | head -n 20

On our DGX Spark (ARM64/SBSA) with Ubuntu 24.04, CUDA toolkit 13.0.88 was pre-installed via /etc/alternatives. We installed PyTorch 2.9.0+cu128 with CUDA support for Python projects.

Solution: PyTorch with CUDA support

Minimum step — install PyTorch with CUDA support:

# For Python projects needing CUDA at runtime
pip install torch==2.9.0+cu128 torchvision==0.24.0 torchaudio==2.9.0 \
  --index-url https://download.pytorch.org/whl/cu128

Verification:

python3 -c "import torch; print(f'PyTorch: {torch.__version__}'); \
print(f'CUDA available: {torch.cuda.is_available()}'); \
print(f'CUDA version: {torch.version.cuda if torch.cuda.is_available() else \"N/A\"}')"

Result:

  • PyTorch: 2.9.0+cu128

  • CUDA available: True

  • CUDA version: 12.8

  • GPU: NVIDIA GB10 (CUDA capability 12.1)

Note for sm_121 GPUs (DGX Spark GB10):

PyTorch supports sm_120/sm_121, but nvrtc doesn’t yet. Disable nvrtc if needed:

export CUDA_NVRTC=0

export NVRTC_DISABLE_NVRTC=1

# Add to ~/.bashrc for persistence
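To make the workaround persistent, as the comment above suggests (sketch; the variable names are taken from this post rather than documented NVRTC settings):

# Append the workaround to ~/.bashrc (same variables as above)
printf 'export CUDA_NVRTC=0\nexport NVRTC_DISABLE_NVRTC=1\n' >> ~/.bashrc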

So… Summary:

  • CUDA toolkit may already be installed on DGX systems (check with which nvcc)

  • For Python projects, install PyTorch 2.9.0+cu128 for CUDA runtime support

  • Disable nvrtc for sm_121 GPUs if runtime compilation issues occur

Your manual CUDA toolkit installation solution is still needed if nvcc is missing. For Python/CUDA runtime, installing PyTorch with CUDA support is sufficient.
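A quick pre-flight check along those lines (sketch; the /etc/alternatives path matches the pre-installed layout mentioned above):

# Is a toolkit already present?
command -v nvcc && nvcc --version
ls -l /etc/alternatives/cuda* 2>/dev/null
ls -d /usr/local/cuda* 2>/dev/null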

This was a simpler workaround until PyTorch ships an ARM64 CUDA patch.

Weird, CUDA 13.0.2 is preinstalled on the DGX OS that ships with the Spark.


Thanks for this input and for providing a direct path. I may have completely missed the pre-installed version and simply reinvented the wheel. For some reason the project I initiated reported that CUDA was not present; I’m not ruling out that this was simply user error on my part.


Thanks for this input and for providing a direct path. Going forward, users should confirm whether CUDA is already installed locally before going down the installation rabbit hole.

But in the event you do need to install or reinstall, the pitfalls roadmap above will hopefully save others some time.

Thanks,