Ollama 0.4.2 released and runs on Nvidia Jetson Orin AGX 64

If, like me, you have struggled to get Ollama running natively (without Docker) on the Jetson Orin AGX 64GB with JetPack 6.1, you will be pleased to learn that Ollama 0.4.2 has just been released and it runs fast. It detects the platform correctly and installs the JetPack 6 components as shown below.

curl -fsSL https://ollama.com/install.sh | sh

Installing ollama to /usr/local
Downloading Linux arm64 bundle
######################################################################## 100.0%
Downloading JetPack 6 components
######################################################################## 100.0%
Adding ollama user to render group…
Adding ollama user to video group…
Adding current user to ollama group…
Creating ollama systemd service…
Enabling and starting ollama service…
NVIDIA JetPack ready.
The Ollama API is now available at 127.0.0.1:11434.
Install complete. Run “ollama” from the command line.

Llama3.2:latest loads in about 3 seconds and runs fast. Llama 3.2-vision also works acceptably fast, but I get a system throttling error due to over-current, as the model briefly causes the Orin AGX 64 to draw 46 watts.
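Once the service is up, the API endpoint reported by the installer (127.0.0.1:11434) can be exercised directly. A minimal sketch using only the standard library is below; it targets the standard Ollama `/api/generate` REST call with the `llama3.2` model mentioned in this thread. The network call itself is commented out so the sketch stays self-contained; uncomment it on the Jetson.

```python
import json
from urllib import request

# Local Ollama API endpoint, as reported by the install script output above.
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"

payload = {
    "model": "llama3.2",              # model used in this thread; adjust to one you have pulled
    "prompt": "Why is the sky blue?",
    "stream": False,                  # request a single JSON response instead of a stream
}

req = request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment on a machine where the Ollama service is running:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

This is the same endpoint the `ollama run` CLI uses under the hood, so it is a quick way to confirm the service is reachable independently of the CLI.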


Hi,
Here are some suggestions for common issues:

1. Performance

Please run the commands below before benchmarking a deep-learning use case:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks
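
To confirm the `nvpmodel -m 0` setting actually took effect, you can query the current power mode with `sudo nvpmodel -q` and parse its output. The sketch below assumes the typical two-line output format I have seen on JetPack releases (`NV Power Mode: MAXN` followed by the mode id); adjust the regexes if your release prints something different.

```python
import re

def current_power_mode(query_output: str) -> tuple[str, int]:
    """Parse the output of `sudo nvpmodel -q`, which typically looks like:

        NV Power Mode: MAXN
        0

    Returns (mode_name, mode_id). The format is an assumption based on
    common JetPack releases, not a guaranteed interface.
    """
    name_match = re.search(r"NV Power Mode:\s*(\S+)", query_output)
    id_match = re.search(r"^\s*(\d+)\s*$", query_output, re.MULTILINE)
    if not (name_match and id_match):
        raise ValueError("unrecognised nvpmodel output")
    return name_match.group(1), int(id_match.group(1))

# On the Jetson itself you could feed it the live output:
#   out = subprocess.run(["sudo", "nvpmodel", "-q"],
#                        capture_output=True, text=True).stdout
#   print(current_power_mode(out))
```

Mode 0 (MAXN) removes the power cap, which is what you want for benchmarking, but note it also makes over-current events more likely, which is relevant to the throttling discussed below in this thread.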

2. Installation

Installation guide of deep learning frameworks on Jetson:

3. Tutorial

Startup deep learning tutorial:

4. Report issue

If these suggestions don’t help and you want to report an issue to us, please share the model, the command/steps, and the customized app (if any) so we can reproduce it locally.

Thanks!


Hi @carolyuu, thank you for the quick reply. I have tried both sudo nvpmodel -m 0 and sudo jetson_clocks, singly and together, but it makes no difference. Interestingly, when running the Llama3.2-Vision models the wattage briefly peaks at 46.x watts, which triggers the over-current alert. Only the Llama3.2-Vision models trigger the throttling; running the bigger Nemotron 70B model does NOT trigger an alert.

Curiously, the over-current trigger is listed as 45 watts in the documentation for the Orin AGX 32GB and 60 watts for the Orin AGX 64GB, which is the model I have. It is as if the 64GB machine has somehow acquired the power configuration for the 32GB machine. Perhaps it needs resetting, though I don’t know how to do this.

That said, I am very impressed with the Jetson range; it’s perfect for my product development, and I really must get around to building a company website so I can join the NVIDIA Inception program.
Hillary

Hi,

Thanks for sharing the experience.
So your issue is fixed after updating to Ollama 0.4.2, is that correct?

There are multiple events that can trigger over-current. You can find the details in the doc below:

https://docs.nvidia.com/jetson/archives/r36.4/DeveloperGuide/SD/PlatformPowerAndPerformance/JetsonOrinNanoSeriesJetsonOrinNxSeriesAndJetsonAgxOrinSeries.html#overcurrent-event-status
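
The doc above describes over-current event status nodes exposed through sysfs. As a quick way to see which rail tripped, you could scan the hwmon counters; the sketch below is an assumption based on the `oc*_event_cnt` node names described in that documentation, and the exact hwmon layout varies by board and JetPack release, so treat it as a starting point rather than a definitive path.

```python
from pathlib import Path

def overcurrent_event_counts(hwmon_root: Path = Path("/sys/class/hwmon")) -> dict[str, int]:
    """Collect over-current event counters from sysfs.

    The `oc*_event_cnt` node names follow the Jetson platform power docs;
    the hwmon directory layout differs between boards, so this is a sketch.
    Returns a mapping of "hwmonN/ocM_event_cnt" -> event count.
    """
    counts = {}
    for node in sorted(hwmon_root.glob("hwmon*/oc*_event_cnt")):
        counts[f"{node.parent.name}/{node.name}"] = int(node.read_text().strip())
    return counts
```

A nonzero counter after a Llama3.2-Vision run would confirm which over-current event is firing, which may help narrow down whether the 45 W vs 60 W threshold discrepancy described earlier is real.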

Thanks.

Thanks for the link to the power information. The issue with downloading and compiling a version of Ollama that uses the Jetson Orin AGX GPU correctly is certainly fixed with the release of 0.4.2 (note: I have corrected the version to Ollama 0.4.2 in my original post). The standard download script for Linux on the Ollama website works as expected. However, Ollama 0.4.1 did NOT work: although it detected the Jetson GPU and apparently installed the correct libraries, it would never run a model of any size and would time out after 5 minutes. Ollama 0.4.2 works fine.

Thanks, this info will also help other users.
Would you mind updating the topic title for the correct Ollama version as well?

I have tried to edit the title, but the platform won’t let me edit that part anymore. The edit button lets me edit the latest post but not earlier ones in the thread. I could delete it and start again, unless you know a way to edit it?

Hi,

We have edited the title accordingly.
Thanks for your feedback.


No problem, glad to help

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.