Can't compile with cuda support

Trying to compile whisper.cpp with CUDA support on a Jetson AGX Orin 64GB, following the instructions on the GitHub page https://github.com/ggerganov/whisper.cpp?tab=readme-ov-file#nvidia-gpu-support, I'm getting the following error:

WHISPER_CUBLAS=1 make -j 
I whisper.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  aarch64
I UNAME_M:  aarch64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/aarch64-linux/include
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/aarch64-linux/include
I LDFLAGS:   -lcuda -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L/usr/local/cuda/lib64 -L/opt/cuda/lib64 -L/targets/aarch64-linux/lib -L/usr/lib/wsl/lib
I CC:       cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
I CXX:      g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0

nvcc --forward-unknown-to-host-compiler -arch=all -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -pthread -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/aarch64-linux/include -Wno-pedantic -c ggml-cuda.cu -o ggml-cuda.o
nvcc fatal   : Value 'all' is not defined for option 'gpu-architecture'
make: *** [Makefile:233: ggml-cuda.o] Error 1

I have the full JetPack installed (including CUDA) and I'm running the default Ubuntu system that came preinstalled on the device.

Hi @paelnever, try setting CUDA_ARCH_FLAG=sm_87, either as an environment variable or in the Makefile:
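A minimal sketch of both options, assuming the `CUDA_ARCH_FLAG` variable name used by the whisper.cpp Makefile of that era (sm_87 is the compute capability of the Orin's Ampere GPU):

```shell
# Option 1: override on the command line for a one-off build
CUDA_ARCH_FLAG=sm_87 WHISPER_CUBLAS=1 make -j

# Option 2: edit the Makefile, replacing the default
#   CUDA_ARCH_FLAG ?= all
# with
#   CUDA_ARCH_FLAG ?= sm_87
# then rebuild as before:
WHISPER_CUBLAS=1 make -j
```

The error itself comes from `nvcc` being passed `-arch=all`, which the CUDA toolkit shipped with that JetPack release does not accept; pinning a concrete architecture sidesteps it.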

Or try CMake and change GGML_CUDA_ARCHITECTURES. I haven't tried building whisper.cpp that way, but I have Dockerfiles for a few different versions of Whisper here:
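A sketch of the CMake route, assuming the `GGML_CUDA_ARCHITECTURES` cache variable mentioned above (some versions of the build use the standard `CMAKE_CUDA_ARCHITECTURES` instead) and the `WHISPER_CUBLAS` option from that era of whisper.cpp; `87` again targets the Orin GPU:

```shell
# Configure an out-of-tree build with cuBLAS enabled and the
# CUDA architecture pinned to the Orin's compute capability
cmake -B build -DWHISPER_CUBLAS=ON -DGGML_CUDA_ARCHITECTURES=87

# Build with all available cores
cmake --build build -j
```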


OK, that edit solved that particular problem. I then had to edit a couple more lines to get whisper.cpp to compile successfully, which was the first step toward compiling talk-llama.
Compiling talk-llama, which was my real objective, hit a small problem that I solved by installing some packages following the instructions here: Pygame on Jetson nano - #9 by user38008
So just run sudo apt-get install libsdl2-ttf-dev libsdl2-image-dev libsdl2-mixer-dev
Thanks for the help. I hope to test your Dockerfiles in the future; I'm trying to find the fastest voice-recognition and best-quality TTS models to build a personal voice assistant on the AGX Orin.

Have you tried Riva? It has very fast and efficient ASR/TTS and uses transformer-based models, so the quality is good:

This is what I use in my llamaspeak videos. Riva isn't out yet for JetPack 6 (it will be soon), so currently it only runs on JetPack 5. And if you find whisper.cpp to be faster/better for streaming ASR, that would be good to know, thanks!
