And here’s the output I received:
llm_load_tensors: offloading 0 repeating layers to GPU
llm_load_tensors: offloaded 0/29 layers to GPU
Despite specifying num_gpu=-1, none of the layers were offloaded to the GPU. My setup includes CUDA 12.6, and the device is a Jetson Orin with Compute Capability 8.7. Could you help me understand why GPU support is not functioning and provide guidance to resolve this issue?
If these suggestions don’t help and you want to report an issue to us, please share the model, the exact command/steps, and the customized app (if any) so we can reproduce the problem locally.
Thank you for the suggestions, but my issue seems to be unrelated to general performance settings or the installation of deep learning frameworks.
I am specifically working with the llama-cpp-python package on a Jetson Orin device with CUDA 12.6. Despite specifying num_gpu=-1 in my code, none of the layers are being offloaded to the GPU.
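In case the build configuration is the culprit, a CUDA-enabled source build of the package is typically forced like this (a sketch, not verified on this device; `GGML_CUDA` is the current llama.cpp CMake flag, while older releases used `LLAMA_CUBLAS`):

```shell
# Force a source build of llama-cpp-python with the CUDA backend enabled.
# Without CMAKE_ARGS, pip may install a prebuilt CPU-only wheel, in which
# case no layers are ever offloaded regardless of the offload setting.
CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```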
And this is my JetPack version:
jetson_release
Software part of jetson-stats 4.2.12 - (c) 2024, Raffaello Bonghi
Model: NVIDIA Jetson AGX Orin Developer Kit - Jetpack 6.1 [L4T 36.4.0]
NV Power Mode[0]: MAXN
Serial Number: [XXX Show with: jetson_release -s XXX]
Hardware:
P-Number: p3701-0005
Module: NVIDIA Jetson AGX Orin (64GB ram)
Platform: