Unable to Utilize GPU for LLM on NVIDIA Jetson AGX Orin

I am trying to run an LLM with CUDA on my NVIDIA Jetson AGX Orin, but the model only uses the CPU, not the GPU.
I am loading the model with llama_cpp and passing “n_ctx=8192, n_gpu_layers=2, ntcx=2048, use_gpu=True”.
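
For anyone landing here with the same symptom, below is a minimal sketch of how the load call could be checked. It assumes the llama-cpp-python bindings, where n_gpu_layers=-1 requests that all layers be offloaded (n_gpu_layers=2 offloads only two, so most of the work stays on the CPU). The helper names build_llama_kwargs, gpu_offload_supported and load_model are hypothetical, and llama_supports_gpu_offload is assumed to be exposed by the installed version, so the code falls back to None when it is missing. As far as I know, ntcx and use_gpu are not arguments of Llama, so they are left out.

```python
from typing import Any, Dict, Optional


def build_llama_kwargs(model_path: str, n_ctx: int = 8192,
                       n_gpu_layers: int = -1) -> Dict[str, Any]:
    """Collect constructor arguments for llama_cpp.Llama (hypothetical helper)."""
    return {
        "model_path": model_path,
        "n_ctx": n_ctx,
        "n_gpu_layers": n_gpu_layers,  # -1 requests offloading every layer
        "verbose": True,  # keep the load log so offload messages stay visible
    }


def gpu_offload_supported() -> Optional[bool]:
    """Return True/False if the installed build reports it, None if unknown."""
    try:
        import llama_cpp
    except (ImportError, OSError, RuntimeError):
        # Package missing, or its native library failed to load.
        return None
    # Assumption: recent llama-cpp-python versions expose this binding.
    probe = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    return bool(probe()) if callable(probe) else None


def load_model(model_path: str):
    """Load a GGUF model with the arguments above (needs llama-cpp-python)."""
    from llama_cpp import Llama  # imported lazily so the helpers work without it
    return Llama(**build_llama_kwargs(model_path))
```

If gpu_offload_supported() returns False, the installed package was most likely built without CUDA, and n_gpu_layers will have no visible effect until it is rebuilt with CUDA enabled. If it returns True, calling load_model with the path to your model file and reading the verbose load log should show whether layers were actually offloaded.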

Request: Any guidance on how to ensure the LLM uses the GPU would be greatly appreciated.

This is a duplicate of the topic "Unable to Utilize GPU for LLM on NVIDIA Jetson AGX Orin" in Jetson & Embedded Systems / Jetson AGX Orin on the NVIDIA Developer Forums.

I am looking to use the GPU of my NVIDIA Jetson AGX Orin to load a Large Language Model (LLM). I’m not limited to llama_cpp; any other method or library that can achieve this is fine with me.

Please see the suggestions in the topic mentioned above instead.
