Build "Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation" on DGX Spark

avrami · November 1, 2025, 8:29pm

Build Ovi on DGX Spark

I came across Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation this morning and wanted to see if I could install it.

“Ovi is a veo-3 like, video+audio generation model that simultaneously generates both video and audio content from text or text+image inputs.”

I tried to follow along with the Step-by-Step installation instructions to build it on the DGX Spark, but found I had to make a few slight tweaks to get it to work.

git clone https://github.com/character-ai/Ovi.git
cd Ovi

# use uv rather than virtualenv
uv venv
source .venv/bin/activate

# these are sort of cargo-culty, but sometimes seem to be needed
export TRITON_PTXAS_PATH=$(which ptxas)
export CUDA_HOME=/usr/local/cuda

# needed to add the --torch-backend flag
uv pip install torch torchvision torchaudio --torch-backend auto

# this line worked
uv pip install -r requirements.txt

# needed to add MAX_JOBS, otherwise the build used all the CPUs
# and then ran out of memory, triggering the OOM killer
MAX_JOBS=4 uv pip install flash_attn --no-build-isolation

Then, I could skip to Download Weights and the rest of the steps all worked.
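
For anyone who wants something less arbitrary than a hard-coded 4, here is a minimal sketch of deriving MAX_JOBS from the core count and available RAM. The function name pick_max_jobs and the per-job memory budget are assumptions made up for illustration, not anything from the Ovi or flash-attn docs. The 16 GB figure in the example is a guess that would need tuning.

```shell
# Hypothetical helper (not part of Ovi, uv, or flash-attn).
# Usage: pick_max_jobs <cpu_count> <available_ram_gb> <ram_budget_per_job_gb>
# The body runs in a subshell so its variables do not leak into the caller.
pick_max_jobs() (
  cpus="$1"
  mem_gb="$2"
  gb_per_job="$3"
  by_mem=$(( mem_gb / gb_per_job ))   # how many jobs fit in RAM
  jobs="$cpus"
  if [ "$by_mem" -lt "$jobs" ]; then jobs="$by_mem"; fi
  if [ "$jobs" -lt 1 ]; then jobs=1; fi   # always allow at least one job
  echo "$jobs"
)

# Example use on Linux (16 GB per job is an assumed budget, tune it):
#   mem_gb=$(awk '/MemAvailable/ {print int($2 / 1048576)}' /proc/meminfo)
#   MAX_JOBS=$(pick_max_jobs "$(nproc)" "$mem_gb" 16) \
#     uv pip install flash_attn --no-build-isolation
```

The result is the smaller of the CPU count and the number of jobs that fit in memory, so a RAM-limited machine gets fewer jobs than it has cores.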

It’s pretty fun. It takes about 15 minutes to generate a 5-second video. So far it has worked best for me when I give it an image to start with. While running inference.py to generate the videos, the GPU peaked at around 50 watts and got up to about 150 °F, iirc.
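
To check those figures rather than rely on memory, something like the following could log power and temperature during a run. This is only a sketch: it assumes nvidia-smi on the Spark reports the power.draw and temperature.gpu fields (some fields can show N/A on some platforms), and the function names log_gpu and c_to_f are made up. nvidia-smi reports Celsius, so a small converter is included.

```shell
# Convert Celsius to Fahrenheit, rounded to the nearest degree.
c_to_f() {
  awk -v c="$1" 'BEGIN { printf "%.0f\n", c * 9 / 5 + 32 }'
}

# Hypothetical helper: print one CSV sample every 5 seconds until interrupted.
log_gpu() {
  nvidia-smi --query-gpu=timestamp,power.draw,temperature.gpu \
             --format=csv,noheader -l 5
}

# Example: start `log_gpu > gpu.log &` before running inference.py,
# then convert a logged reading with e.g. `c_to_f 65`.
```

For reference, 150 °F corresponds to roughly 66 °C.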

joey28 · November 1, 2025, 9:14pm

Is the MAX_JOBS=4 fix solely for uv or can you have the same problem with a “normal” pip install? I ask because I have had long installs crash the DGX at completion, even though the install worked.

avrami · November 1, 2025, 9:35pm

It’s not uv related — got the fix off the GitHub issues for the flash-attn thing.

I used to have that issue a long time ago on a shared 36-CPU Sun SPARC box that iirc had 96 GB of RAM: make or cmake or something like that will use all the CPUs it can find and end up eating all the RAM, then swap, and then the OOM killer starts shooting random processes.

ETA: my temperature must be too high because I’m hallucinating. Solaris had no OOM killer, but a build would take down the box and end up failing if you didn’t limit the number of CPUs it used. Also had to nice and ionice the heck out of every build script.
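
The nice/ionice approach carries over to Linux builds. A minimal sketch, assuming a Linux box with ionice from util-linux. low_prio is a made-up wrapper name, and it falls back to plain nice when ionice is missing or not permitted.

```shell
# Hypothetical wrapper: run a command at the lowest CPU priority and,
# where possible, in the idle I/O scheduling class.
low_prio() {
  if ionice -c 3 true 2>/dev/null; then
    nice -n 19 ionice -c 3 "$@"
  else
    nice -n 19 "$@"
  fi
}

# Example:
#   low_prio env MAX_JOBS=4 uv pip install flash_attn --no-build-isolation
```

This only keeps the machine responsive while the build runs. It does not cap memory use, so MAX_JOBS is still needed.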
