Build "Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation" on DGX Spark

Build Ovi on DGX Spark

I came across Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation this morning and wanted to see if I could install it.

“Ovi is a veo-3 like, video+audio generation model that simultaneously generates both video and audio content from text or text+image inputs.”

I tried to follow the Step-by-Step Installation instructions to build on the DGX Spark, but found I had to make some slight tweaks to get it to work.

git clone https://github.com/character-ai/Ovi.git
cd Ovi

# use uv rather than virtualenv
uv venv
source .venv/bin/activate

# these are sort of cargo-culty, but sometimes seem to be needed
export TRITON_PTXAS_PATH=$(which ptxas)
export CUDA_HOME=/usr/local/cuda

# needed to add the --torch-backend flag
uv pip install torch torchvision torchaudio --torch-backend auto

# this line worked
uv pip install -r requirements.txt

# needed to add MAX_JOBS, otherwise the build used all the CPUs,
# ran out of memory, and triggered the OOM killer
MAX_JOBS=4 uv pip install flash_attn --no-build-isolation

Then, I could skip to Download Weights and the rest of the steps all worked.
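
On the MAX_JOBS=4 step above: if you'd rather not hard-code 4, here is a rough sketch of how a job count could be derived from core count and available memory. The function name pick_max_jobs and the GB-per-job figure are my own assumptions for illustration, not something taken from the Ovi or flash-attn instructions; only MAX_JOBS itself comes from the command above.

```shell
#!/usr/bin/env bash
# pick_max_jobs CPUS MEM_GB GB_PER_JOB   (hypothetical helper)
# Prints the smaller of CPUS and MEM_GB / GB_PER_JOB, never less than 1.
# GB_PER_JOB is a guess at how much RAM one compile job needs at peak.
pick_max_jobs() {
    local cpus=$1 mem_gb=$2 gb_per_job=$3
    # Avoid dividing by zero if a silly estimate is passed in.
    if (( gb_per_job < 1 )); then
        gb_per_job=1
    fi
    local by_mem=$(( mem_gb / gb_per_job ))
    local jobs=$(( cpus < by_mem ? cpus : by_mem ))
    if (( jobs < 1 )); then
        jobs=1
    fi
    echo "$jobs"
}

# Example use on Linux (not run here; the 16 GB per job is an assumption):
#   cpus=$(nproc)
#   mem_gb=$(awk '/MemAvailable/ {print int($2 / 1048576)}' /proc/meminfo)
#   MAX_JOBS=$(pick_max_jobs "$cpus" "$mem_gb" 16) \
#       uv pip install flash_attn --no-build-isolation
```

The idea is just to cap parallelism by whichever runs out first, cores or memory. If the build still gets killed, raise the per-job estimate.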

It’s pretty fun. It takes about 15 minutes to generate a 5-second video. So far it’s worked best for me when I give it an image to start with. While running inference.py to generate the videos, the GPU peaked at around 50 watts and got up to about 150 °F, iirc.


Is the MAX_JOBS=4 fix solely for uv or can you have the same problem with a “normal” pip install? I ask because I have had long installs crash the DGX at completion, even though the install worked.

It’s not uv related — got the fix off the GitHub issues for the flash-attn thing.

I used to have that issue a long time ago on a shared 36-CPU Sun SPARC box that iirc had 96 GB of RAM. make or cmake or something like that will use all the CPUs it can find and end up eating all the RAM, then swap, and then the OOM killer starts shooting random processes.

ETA: my temperature must be too high because I’m hallucinating. Solaris had no OOM killer, but it would take down the box and end up failing if you didn’t limit the number of CPUs a build used. Also had to nice and ionice the heck out of every build script.
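
For anyone who hasn't used them, this is roughly what the nice and ionice part looks like on a Linux box. It's a minimal sketch: the wrapper name low_priority is made up for illustration, and it falls back to plain nice if ionice is missing or can't set the idle class.

```shell
#!/usr/bin/env bash
# low_priority CMD [ARGS...]   (hypothetical wrapper)
# Runs CMD at the lowest CPU priority (nice 19) and, where ionice works,
# in the idle I/O scheduling class (-c 3) so it yields disk to other work.
low_priority() {
    # Probe first: ionice may be absent or blocked in some environments.
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        nice -n 19 ionice -c 3 "$@"
    else
        nice -n 19 "$@"
    fi
}

# Example (not run here), combined with a capped job count:
#   MAX_JOBS=4 low_priority uv pip install flash_attn --no-build-isolation
```

This only lowers the build's priority. It does nothing about memory, so the job limit is still needed.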
