CUDA 13.2 DGX Spark impact

My recap of what CUDA 13.2 brings that matters for GB10:

The Big Wins for DGX Spark / SM121

cuBLASLt: NVFP4 and MXFP8 performance improvements on DGX Spark. This is the headline item: cuBLASLt now delivers up to a 3× performance improvement for NVFP4 and MXFP8 data types on DGX Spark systems for large M and N problem sizes. Also, cuBLASLt’s experimental Grouped GEMM API now supports MXFP8 inputs on GPUs with Compute Capability 10.x and 11.0.

Critical bug fix: A cublasLtMatmul issue that could lead to incorrect results when running concurrently with another kernel that uses Tensor Memory has been fixed. It affected Compute Capability 10.x and 11.x since cuBLAS 12.8. This could be related to the quality degradation people were seeing.

CUDA Tile — Now on SM120/SM121

CUDA Tile is now supported on compute capability 8.X (Ampere and Ada), as well as 10.X and 12.X architectures (Blackwell). This is the new tile-based programming model NVIDIA introduced in CUDA 13.1. cuTile Python (the Python DSL) now supports recursive functions, closures, custom reductions, and enhanced array slicing. This could eventually become a cleaner path to writing optimized NVFP4 kernels for SM121 than the current CUTLASS patch-and-pray approach.
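For anyone wondering where their own GPU falls, here is a minimal Python sketch that only encodes the sentence above (CUDA Tile listed for compute capability 8.X, 10.X and 12.X). The helper names and the lookup table are my own illustration, not an NVIDIA API, and the result is only as accurate as the release-note wording quoted here.

```python
# Major compute-capability families listed above for CUDA Tile in 13.2.
# Hypothetical lookup table for illustration; not part of any NVIDIA library.
TILE_MAJORS = {8: "Ampere/Ada", 10: "Blackwell", 12: "Blackwell"}


def parse_sm(name: str):
    """Turn an 'SM121'-style name into (major, minor); the last digit is the minor."""
    digits = name.upper().removeprefix("SM")
    return int(digits[:-1]), int(digits[-1])


def tile_family(major: int):
    """Return the architecture label if the major version is listed, else None."""
    return TILE_MAJORS.get(major)


if __name__ == "__main__":
    for sm in ("SM121", "SM120", "SM90", "SM89"):
        major, minor = parse_sm(sm)
        print(sm, (major, minor), tile_family(major) or "not listed")
```

SM121 parses to (12, 1), so the Spark falls in the 12.X family.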

Unified Tegra + Desktop Toolkit

CUDA 13.2 delivers a single unified toolkit for Tegra and desktop GPUs, reducing overhead for containers and libraries. This is relevant for DGX Spark since GB10 is an aarch64 Tegra-derived SoC: fewer divergences between the Tegra and desktop CUDA paths mean less chance of hitting SM121-specific bugs that only appear on the Spark.

Other Notable Items

PTX ISA 9.2 — new PTX features, worth checking if there are any SM121-specific instruction improvements.

Compiler: support for new host compilers including VS 2026, plus improved nvcc host compilation support on aarch64 systems, including fixes for ARM Neon intrinsics when using newer GCC versions.

CUDA_DISABLE_PERF_BOOST env var added — lets you disable GPU power state boosting, useful for power management in your rack enclosure project.
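If you want to try the environment variable for a single process rather than system-wide, something like the following Python sketch should work. Note the assumptions: the value "1" is my guess at how the variable is switched on (check the release notes), the helper name is made up, and the child command here is only a stand-in that echoes the variable back.

```python
import os
import subprocess
import sys


def env_without_perf_boost(base=None):
    """Return a copy of the environment with CUDA_DISABLE_PERF_BOOST set.

    ASSUMPTION: the value "1" enables the behaviour; verify in the release notes.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_DISABLE_PERF_BOOST"] = "1"
    return env


if __name__ == "__main__":
    # Stand-in child process that only prints the variable it received.
    # Replace this command with your actual CUDA application.
    cmd = [sys.executable, "-c",
           "import os; print(os.environ.get('CUDA_DISABLE_PERF_BOOST'))"]
    subprocess.run(cmd, env=env_without_perf_boost(), check=True)
```

Passing a copied environment to the child keeps the setting out of your shell, so other CUDA jobs are unaffected.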

Is it worth upgrading to CUDA 13.2?
What do you guys think?


Definitely look forward to upgrading - once it arrives in the official system updates.


Dear @aniculescu /NVIDIA,
I wonder when we will be able to install CUDA 13.2 (CUDA Toolkit 13.2 Release Notes), or whether our GB10 devices (e.g., DGX Spark) will be upgraded to it automatically, since this version contains multiple DGX Spark improvements.
Many thanks!


I hope this gets released very soon. This is what nvcc --version reports after today’s latest update:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Aug_20_01:57:39_PM_PDT_2025
Cuda compilation tools, release 13.0, V13.0.88
Build cuda_13.0.r13.0/compiler.36424714_0

Just saying “hi” while waiting for everyone else for the official GB10 update with CUDA 13.2!
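If you want to script this check instead of eyeballing the output, here is a small Python sketch. It assumes the "release X.Y" line keeps the format shown above and that nvcc is on your PATH; the function names are my own.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text: str):
    """Extract (major, minor) from the 'release X.Y' line of nvcc --version output."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_cuda_release():
    """Run nvcc --version if nvcc is on PATH; return (major, minor) or None."""
    if shutil.which("nvcc") is None:
        return None
    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    return parse_nvcc_release(out.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found or version line not recognised")
    else:
        print("CUDA toolkit release:", release, "| 13.2 or newer:", release >= (13, 2))
```

Comparing tuples rather than strings avoids surprises such as "13.10" sorting before "13.2".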


The community docker is using nvidia/cuda:13.2.0-devel-ubuntu24.04 as a base now.


What if you run something outside a docker container? Any official way to update to 13.2 on the Spark?

The “official” and safe way is to wait until it gets approved for use with the DGX Spark/GB10 platform. Otherwise you might end up with a broken system, or at least an impaired one.

If you’re comfortable working with Linux systems, you can install “bleeding edge” drivers, provided you know how to quickly roll back to the old driver even without a working console (fallback via SSH).

Nevertheless, you should always have a backup on hand. ;-)

Nah, I may feel comfortable, but I still don’t wanna risk everything I have on my Sparks. They are too precious. :)