I saw a paper this morning on Hugging Face, “Step-Audio-EditX Technical Report,” and I wanted to try running the code following the directions on the GitHub page (stepfun-ai/Step-Audio-EditX: a 3B-parameter, LLM-based reinforcement-learning audio editing model that edits emotion, speaking style, and paralinguistics, and also does robust zero-shot text-to-speech).
I ran into one issue: Microsoft does not publish an aarch64 Linux wheel for the pinned onnxruntime-gpu version.
× No solution found when resolving dependencies:
╰─▶ Because onnxruntime-gpu==1.17.0 has no wheels with a matching platform tag (e.g., `manylinux_2_39_aarch64`) and you
require onnxruntime-gpu==1.17.0, we can conclude that your requirements are unsatisfiable.
hint: Wheels are available for `onnxruntime-gpu` (v1.17.0) on the following platforms: `manylinux_2_28_x86_64`,
`win_amd64`
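For context, you can confirm the platform-tag mismatch locally. A quick check (the outputs shown are what I’d expect on a DGX Spark, so treat them as assumptions):
uname -m
# aarch64
python3 -c "import sysconfig; print(sysconfig.get_platform())"
# linux-aarch64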
Dependency #1 – CUDNN_HOME
I found I needed to set CUDNN_HOME. I’m not sure of the best way to do this, but here is what I did:
mkdir -p ~/cudnn/
cd ~/cudnn/
wget https://developer.download.nvidia.com/compute/cudnn/redist/cudnn/linux-aarch64/cudnn-linux-aarch64-9.13.1.26_cuda13-archive.tar.xz
tar -xf cudnn-linux-aarch64-9.13.1.26_cuda13-archive.tar.xz
export CUDNN_HOME=~/cudnn/cudnn-linux-aarch64-9.13.1.26_cuda13-archive
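As a sanity check, the directory CUDNN_HOME points at should contain the headers and libraries the build expects:
ls $CUDNN_HOME/include/cudnn*.h
ls $CUDNN_HOME/lib/libcudnn*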
There has got to be a better way? Or is NVIDIA’s odd Python packaging of cuDNN the best option, and I just can’t figure out how to make ONNX Runtime’s build.sh use that package? Either way, CUDNN_HOME is needed for the ONNX Runtime build.
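If the pip route is viable, it might look something like the sketch below. I have not verified this on the Spark; it assumes NVIDIA publishes an nvidia-cudnn-cu13 wheel that keeps the include/ and lib/ layout under the module directory the way the cu12 wheels do, and it needs to run inside whatever venv you’re building in:
uv pip install nvidia-cudnn-cu13
export CUDNN_HOME=$(python -c "import os, nvidia.cudnn; print(os.path.dirname(nvidia.cudnn.__file__))")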
Dependency #2 – ONNX Runtime
This was an adventure to get built. I could not figure out how to get a compatible wheel from Microsoft, so I had to build the wheel from source.
I set up my cargo-cult build environment:
export TORCH_CUDA_ARCH_LIST=12.1a        # GB10 (Blackwell) compute capability on DGX Spark
export TRITON_PTXAS_PATH=$(which ptxas)  # point Triton at the system ptxas
export CUDA_HOME=/usr/local/cuda
export UV_TORCH_BACKEND=auto             # let uv pick a matching PyTorch CUDA build
export MAX_JOBS=4                        # limit parallel compile jobs
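Before kicking off the build, I’d sanity-check that the CUDA toolchain these variables point at is actually visible (my own habit, not from the linked instructions):
nvcc --version
which ptxas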
I found directions for building on DGX Spark in a comment on a GitHub issue: “Pip cannot find package for Nvidia DGX Spark (arm linux)” (microsoft/onnxruntime#26351).
uv venv venv-build-onnxruntime
source ./venv-build-onnxruntime/bin/activate
git clone https://github.com/microsoft/onnxruntime
cd onnxruntime
uv pip install cmake ninja packaging numpy setuptools
sh build.sh --config Release --build_dir build/cuda13 --parallel 4 --nvcc_threads 1 --use_cuda \
--cuda_version 13.0 --cuda_home $CUDA_HOME \
--cudnn_home $CUDNN_HOME \
--build_wheel --skip_tests \
--cmake_generator Ninja \
--use_binskim_compliant_compile_flags \
--cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=121 onnxruntime_BUILD_UNIT_TESTS=OFF
mkdir -p ~/wheels
cp ./build/cuda13/Release/dist/onnxruntime_gpu-1.24.0-cp312-cp312-linux_aarch64.whl ~/wheels
deactivate
Now there is a wheel built that should work on the DGX Spark.
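A quick smoke test of the wheel (a sketch; run in any Python 3.12 venv):
uv pip install ~/wheels/onnxruntime_gpu-1.24.0-cp312-cp312-linux_aarch64.whl
python -c "import onnxruntime as ort; print(ort.__version__, ort.get_available_providers())"
# CUDAExecutionProvider should appear in the providers list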
Dependency #3 – ffmpeg
sudo apt install ffmpeg
This is needed for audio file processing in the demo app.
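A one-liner to confirm it installed cleanly:
ffmpeg -version | head -n 1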
Build Steps
Now things are ready to follow the GitHub directions, almost.
Where their GitHub says:
git clone https://github.com/stepfun-ai/Step-Audio-EditX.git
conda create -n stepaudioedit python=3.10
conda activate stepaudioedit
cd Step-Audio-EditX
pip install -r requirements.txt
git lfs install
git clone https://huggingface.co/stepfun-ai/Step-Audio-Tokenizer
git clone https://huggingface.co/stepfun-ai/Step-Audio-EditX
I did the following:
uv venv venv-step-audio
source ./venv-step-audio/bin/activate
uv pip install ~/wheels/onnxruntime_gpu-1.24.0-cp312-cp312-linux_aarch64.whl
git clone https://github.com/stepfun-ai/Step-Audio-EditX.git
cd Step-Audio-EditX
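One caveat about the venv created above: the wheel is tagged cp312, so the venv needs Python 3.12 (upstream’s conda instructions use 3.10). If uv’s default interpreter is something else, pin it explicitly, e.g.:
uv venv venv-step-audio --python 3.12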
Then I had to comment out line 7 in requirements.txt:
diff --git a/requirements.txt b/requirements.txt
index de402f1..b8fa290 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -6,3 +6,3 @@ accelerate==1.3.0
openai-whisper==20240930
-onnxruntime-gpu==1.17.0
+#onnxruntime-gpu==1.17.0
onnxruntime
With that change I was able to finish the step, and everything I’ve tried below has been working.
uv pip install -r requirements.txt
git lfs install
git clone https://huggingface.co/stepfun-ai/Step-Audio-Tokenizer
git clone https://huggingface.co/stepfun-ai/Step-Audio-EditX
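Because requirements.txt still pulls in the CPU-only onnxruntime package, I’d double-check that the locally built GPU wheel is the one that actually loads (my own sanity check, not part of the repo’s directions):
uv pip list | grep -i onnxruntime
python -c "import onnxruntime as ort; print(ort.__version__, ort.get_available_providers())"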
It runs a little demo server on port 7860 that lets you do one of two things with an uploaded voice recording. You can “Clone” the voice and make it say anything you want, or use “Edit” mode to clean up the voice and edit the emotion. You can also do paralinguistic editing, but I haven’t figured that one out in the UI yet.