Compile llama.cpp to use the Jetson Orin Nano Super GPU

kegintheai · September 1, 2025, 5:35pm

Hi. I’m trying to compile llama.cpp to detect and use the GPU of the Jetson Orin Nano Super. I’ve tried all the instructions and tricks posted online, including the instructions on the llama.cpp GitHub to compile for CUDA. Unfortunately, all my attempts failed. I still get the message that NO GPU IS DETECTED, and -ngl 99 is ignored when I launch llama-cli or llama-server. I have also tried the Jetson Container for llama.cpp, and that failed too.

I would truly appreciate any guidance or pointers to make it work. Thanks!

carolyuu · September 1, 2025, 5:48pm

*** Please note that this reply is generated by LLM automatically ***

I’d be happy to help you troubleshoot the issue with compiling llama.cpp to detect and use the GPU of the Jetson Orin Nano.

Firstly, let’s ensure that your Jetson Orin Nano is properly set up and configured. The Jetson Orin Nano is a production module that requires a carrier board designed or procured for your end product. Have you attached the module to a carrier board and flashed it with the software image you have developed?

Assuming that your Jetson Orin Nano is properly set up, let’s move on to the compilation of llama.cpp. The instructions on the llama.cpp GitHub provide guidance on compiling for CUDA, but there may be specific requirements or configurations that need to be met for the Jetson Orin Nano.

Here are a few potential issues that might be causing the “NO GPU IS DETECTED” message:

CUDA version: The Jetson Orin Nano uses CUDA 11.4, which might be different from the version used in the llamacpp instructions. Ensure that you are using the correct CUDA version and that it is properly installed on your system.
GPU architecture: The Jetson Orin Nano has an Ampere GPU architecture, which might require specific configurations or flags during compilation. Check the llama.cpp documentation to see if there are any specific requirements for Ampere GPUs.
Device detection: The Jetson Orin Nano might not be properly detected by the llama.cpp compilation process. You can try compiling and running a small CUDA program with nvcc to confirm that the GPU is properly recognized.
Driver version: The Jetson Orin Nano requires a specific driver version to function properly. Ensure that you are using the correct driver version and that it is properly installed on your system.

To troubleshoot the issue, you can try the following:

Check the CUDA version: Run nvcc --version to check the CUDA version installed on your system.
Check the GPU architecture: Run nvidia-smi to check the GPU architecture and ensure that it is properly detected.
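Check the device detection: Compile a small CUDA test program, for example with nvcc -o detect_gpu detect_gpu.cu (detect_gpu.cu is a file you would need to write yourself, for instance one that calls cudaGetDeviceCount), and run it to confirm that the GPU is recognized.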
Check the driver version: Run nvidia-smi to check the driver version and ensure that it is properly installed.
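
The version checks above can be sketched as one small script. This is only an illustration, not part of the original reply: the function name report_cuda_toolchain is made up, and it assumes the CUDA toolkits live under /usr/local/cuda*, which is the usual layout on JetPack but may differ on your system.

```shell
#!/usr/bin/env bash
# Report which CUDA toolchain the current shell will actually use.
# Assumption: toolkits are installed under /usr/local/cuda* (typical for JetPack).

report_cuda_toolchain() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc path: $(command -v nvcc)"
        # The "release X.Y" line of nvcc --version carries the toolkit version.
        nvcc --version | grep -i "release" || true
    else
        echo "nvcc path: not found on PATH"
    fi
    # List every toolkit directory installed side by side.
    ls -d /usr/local/cuda* 2>/dev/null || echo "no /usr/local/cuda* directories found"
}

report_cuda_toolchain
```

If the reported nvcc path does not belong to the toolkit you expect, the build may be picking up a different CUDA version than the one you intended.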

If none of these troubleshooting steps resolve the issue, you can try the Jetson Container for llama.cpp, which is specifically designed for the Jetson platform. However, since you mentioned that you have already tried this approach and it failed, we might need to dig deeper into the issue.

In comparison to other Jetson devices, the Jetson Orin Nano has a unique GPU architecture and requires specific configurations. The Jetson AGX Orin, for example, has a more powerful GPU and might require different configurations or flags during compilation.

If you provide more details about your compilation process and the errors you are encountering, I can try to provide more specific guidance or pointers to help you resolve the issue.

kegintheai · September 2, 2025, 3:36am

Unfortunately this info didn’t help and some of it is outdated e.g. my Jetson Orin Nano is using CUDA 12.6

AastaLLL · September 2, 2025, 6:16am

Hi,

Could you check whether the commands below work?

git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp && cmake -B build \
        -DGGML_CUDA=ON \
        -DGGML_CUDA_F16=on \
        -DLLAMA_CURL=on \
        -DGGML_CUDA_FA_ALL_QUANTS=ON  \
        -DCMAKE_CUDA_ARCHITECTURES="87"
cmake --build build --config Release --parallel 8
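
After the configure step, one way to confirm that CMake really enabled CUDA is to look at the cache it wrote. The sketch below is an illustration only: show_cuda_cache is a hypothetical helper name, and it assumes the build directory is ./build as in the commands above.

```shell
#!/usr/bin/env bash
# Print the CUDA-related entries that CMake recorded for a build directory.
# Usage: show_cuda_cache [build_dir]   (defaults to ./build)
show_cuda_cache() {
    local cache="${1:-build}/CMakeCache.txt"
    if [ ! -f "$cache" ]; then
        echo "no CMakeCache.txt in ${1:-build}; run the cmake configure step first"
        return 1
    fi
    # GGML_CUDA should be ON, CMAKE_CUDA_ARCHITECTURES should be 87,
    # and CMAKE_CUDA_COMPILER shows which nvcc CMake picked up.
    grep -E '^(GGML_CUDA|CMAKE_CUDA_ARCHITECTURES|CMAKE_CUDA_COMPILER):' "$cache"
}

show_cuda_cache build || true
```

If GGML_CUDA is missing or OFF here, the resulting binaries were built without the CUDA backend and -ngl will have no effect.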

Thanks.

kegintheai · September 2, 2025, 8:23am

@AastaLLL Hi. It worked! Thank you so much for your amazing help. I really appreciate it!

I’m so sorry for my earlier inaccurate feedback. I had a clash of nvcc versions, which seems to have caused a cmake crash.

ggml_cuda_init: found 1 CUDA devices:
Device 0: Orin, compute capability 8.7, VMM: yes
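
For anyone hitting a similar nvcc clash, a rough way to spot it is to list every nvcc reachable through PATH. This is a sketch under assumptions, not what was actually run in this thread; list_nvcc_on_path is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# List every nvcc executable reachable through a PATH-style string, in lookup order.
# Usage: list_nvcc_on_path [path_string]   (defaults to $PATH)
list_nvcc_on_path() {
    local path_string="${1:-$PATH}" dir
    local IFS=':'
    # Unquoted on purpose so the string is split on ':'.
    for dir in $path_string; do
        if [ -n "$dir" ] && [ -x "$dir/nvcc" ]; then
            echo "$dir/nvcc"
        fi
    done
}

list_nvcc_on_path
```

The first path printed is the one the shell uses. More than one line of output means several toolkits are competing on PATH.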

Once again, thank you so much!

AastaLLL · September 3, 2025, 9:48am

Hi,

Could you set the two environment variables below and rebuild?

export PATH=/usr/local/cuda-12.6/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH
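
A slightly more defensive variant of the two exports is sketched below. It is an illustration only: use_cuda_toolkit is a hypothetical helper name, and /usr/local/cuda-12.6 is taken from the lines above, so adjust it if your toolkit is installed elsewhere.

```shell
#!/usr/bin/env bash
# Prepend a CUDA toolkit to PATH and LD_LIBRARY_PATH, but only if it exists.
# Usage: use_cuda_toolkit /usr/local/cuda-12.6
use_cuda_toolkit() {
    local root="$1"
    if [ ! -d "$root/bin" ]; then
        echo "skipping: $root/bin does not exist"
        return 1
    fi
    export PATH="$root/bin:$PATH"
    # ${VAR:+...} avoids a trailing ':' when LD_LIBRARY_PATH was previously empty.
    export LD_LIBRARY_PATH="$root/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    echo "using CUDA toolkit at $root"
}

use_cuda_toolkit /usr/local/cuda-12.6 || true
# A build directory configured with a different nvcc keeps that compiler path in
# its CMake cache, so remove it (rm -rf build) before re-running the cmake commands.
```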

Thanks.

kegintheai · September 3, 2025, 11:47am

@AastaLLL Hi, sure will do. Thank you so much!
