Updating Orin Nano breaks Ollama

First, my apologies if I didn’t pick the right area to post. I run an Ollama Docker container that I use for both Home Assistant and Frigate. I recently updated my Orin Nano using apt, and now I am getting all kinds of out-of-memory errors.

Here is what was updated:

(nvidia-l4t-weston:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), containerd.io:arm64 (1.7.28-0~ubuntu.22.04~jammy, 1.7.28-1~ubuntu.22.04~jammy), docker-compose-plugin:arm64 (2.39.4-0~ubuntu.22.04~jammy, 2.40.0-1~ubuntu.22.04~jammy), nvidia-l4t-vulkan-sc-samples:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), docker-ce-cli:arm64 (5:28.5.0-1~ubuntu.22.04~jammy, 5:28.5.1-1~ubuntu.22.04~jammy), nvidia-l4t-vulkan-sc-sdk:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-firmware:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), wpasupplicant:arm64 (2:2.10-6ubuntu2.2, 2:2.10-6ubuntu2.3), nvidia-l4t-kernel-oot-headers:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-oem-config:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-jetson-multimedia-api:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-libwayland-egl1:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-wayland:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-kernel-oot-modules:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-kernel:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-graphics-demos:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), gir1.2-javascriptcoregtk-4.0:arm64 (2.48.5-0ubuntu0.22.04.1, 2.48.7-0ubuntu0.22.04.2), nvidia-l4t-3d-core:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvpmodel:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-cuda-utils:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), gir1.2-webkit2-4.0:arm64 (2.48.5-0ubuntu0.22.04.1, 2.48.7-0ubuntu0.22.04.2), libwbclient0:arm64 (2:4.15.13+dfsg-0ubuntu1.8, 2:4.15.13+dfsg-0ubuntu1.9), nvidia-l4t-tools:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), libsmbclient:arm64 (2:4.15.13+dfsg-0ubuntu1.8, 2:4.15.13+dfsg-0ubuntu1.9), nvidia-l4t-core:arm64 
(36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-kernel-dtbs:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-optee:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-cuda:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), poppler-utils:arm64 (22.02.0-2ubuntu0.10, 22.02.0-2ubuntu0.11), nvidia-l4t-dla-compiler:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvml:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), docker-ce:arm64 (5:28.5.0-1~ubuntu.22.04~jammy, 5:28.5.1-1~ubuntu.22.04~jammy), nvidia-l4t-openwfd:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvfancontrol:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-libwayland-client0:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvsci:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-init:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-gbm:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-jetsonpower-gui-tools:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-vulkan-sc:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-display-kernel:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), libpoppler-glib8:arm64 (22.02.0-2ubuntu0.10, 22.02.0-2ubuntu0.11), nvidia-l4t-configs:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-pva:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-multimedia:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-multimedia-utils:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-x11:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), docker-ce-rootless-extras:arm64 (5:28.5.0-1~ubuntu.22.04~jammy, 5:28.5.1-1~ubuntu.22.04~jammy), nvidia-l4t-apt-source:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), libpoppler118:arm64 (22.02.0-2ubuntu0.10, 22.02.0-2ubuntu0.11), 
nvidia-l4t-kernel-headers:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-bootloader:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), snapd:arm64 (2.68.5+ubuntu22.04.1, 2.71+ubuntu22.04), nvidia-l4t-gstreamer:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), libjavascriptcoregtk-4.0-18:arm64 (2.48.5-0ubuntu0.22.04.1, 2.48.7-0ubuntu0.22.04.2), nvidia-l4t-camera:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-libwayland-cursor0:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvpmodel-gui-tools:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), samba-libs:arm64 (2:4.15.13+dfsg-0ubuntu1.8, 2:4.15.13+dfsg-0ubuntu1.9), nvidia-l4t-initrd:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), libwebkit2gtk-4.0-37:arm64 (2.48.5-0ubuntu0.22.04.1, 2.48.7-0ubuntu0.22.04.2), nvidia-l4t-libwayland-server0:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-xusb-firmware:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-jetson-io:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-vulkan-sc-dev:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033)

error from Open WebUI

500: llama runner process has terminated: cudaMalloc failed: out of memory ggml_gallocr_reserve_n: failed to allocate CUDA0 buffer of size 315359232

@pdobrien3 I think you chose the wrong category for this issue; please select an appropriate one.

Hello,

Thanks for visiting the NVIDIA Developer Forums.
To ensure better visibility and support, I’ve moved your post to the Jetson category, where it’s more appropriate.

Cheers,
Tom

Hi,

Which container do you use?
Could you try our Ollama image to see if it works?

https://hub.docker.com/r/dustynv/ollama/tags

Thanks.

That is the image I use.

image: dustynv/ollama:0.6.8-r36.4-cu126-22.04

I think I have made some progress by adding the following environment variables:

GGML_CUDA_ENABLE_UNIFIED_MEMORY=1

OLLAMA_GPU_OVERHEAD=16032385536

I can get queries to work without errors from Open WebUI, but Frigate now times out. This is my environment variables section:

  - NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics
  - PULSE_SERVER=unix:/run/user/1000/pulse/native
  - OLLAMA_HOST=0.0.0.0
  - OLLAMA_NUM_PARALLEL=1 
  - OLLAMA_MAX_QUEUE=256
  - OLLAMA_MAX_LOADED_MODELS=2
  - OLLAMA_FLASH_ATTENTION=1 
  - OLLAMA_LOAD_TIMEOUT=10m
  - OLLAMA_KEEP_ALIVE=1440m
  - GGML_CUDA_ENABLE_UNIFIED_MEMORY=1
  - OLLAMA_GPU_OVERHEAD=16032385536
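One thing worth double-checking before treating these values as correct (a quick sanity check of my own, based on my understanding that OLLAMA_GPU_OVERHEAD is specified in bytes of memory to reserve):

```shell
# Convert the OLLAMA_GPU_OVERHEAD value above to GiB. 16032385536 bytes
# is roughly 14 GiB, which is more than the 8 GB of shared CPU/GPU
# memory on the Orin Nano, so this setting looks suspicious.
overhead_bytes=16032385536
gib=$(( overhead_bytes / 1073741824 ))   # 1 GiB = 1073741824 bytes
echo "OLLAMA_GPU_OVERHEAD ~= ${gib} GiB"
```

If the value was meant as a workaround, something closer to a few hundred MB would seem more plausible on an 8 GB board.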

Can anyone confirm whether these environment variables are appropriate for the Jetson Orin Nano?

Hi,

For example, we can launch Gemma 3 4B with the command below:

docker run -it --rm \
  -e OLLAMA_MODEL=gemma3:4b \
  -e OLLAMA_MODELS=/root/.ollama \
  -e OLLAMA_HOST=0.0.0.0:9000 \
  -e OLLAMA_CONTEXT_LEN=4096 \
  -e OLLAMA_LOGS=/root/.ollama/ollama.log \
  -v /mnt/nvme/cache/ollama:/root/.ollama \
  --gpus all \
  -p 9000:9000 \
  -e DOCKER_PULL=always --pull always \
  -e HF_TOKEN=${HF_TOKEN} \
  -e HF_HUB_CACHE=/root/.cache/huggingface \
  -v /mnt/nvme/cache:/root/.cache \
  dustynv/ollama:main-r36.4.0

It works without adding the GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 setting.
But please make sure to add --gpus all to allow GPU access within the container.

Thanks.
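For what it’s worth, here is a quick smoke test against the container started by the command above; the host, port, and model name mirror that example and may need adjusting for your setup:

```shell
# Probe the Ollama API exposed on port 9000 by the docker run example.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:9000}"

# Server version; returns a small JSON object when the server is up.
curl -s "$OLLAMA_URL/api/version" || echo "server not reachable"

# One-off generation to confirm the model really loads.
curl -s "$OLLAMA_URL/api/generate" \
  -d '{"model": "gemma3:4b", "prompt": "Say hi", "stream": false}' \
  || echo "generate request failed"
```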

Appreciate the reply; unfortunately, adding that did not help the situation. I am stumped: Ollama works great through Open WebUI, better than ever, but anything else that tries to use the Ollama container now times out, when it didn’t before I upgraded.

These are the only logs around a failed attempt to analyze a picture/snapshot:

time=2025-10-24T07:24:05.967-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.alignment default=32
time=2025-10-24T07:24:05.975-04:00 level=WARN source=routes.go:281 msg="the context field is deprecated and will be removed in a future version of Ollama"
[GIN] 2025/10/24 - 07:25:05 | 500 | 1m0s | 192.168.51.140 | POST "/api/generate"
[GIN] 2025/10/24 - 07:28:57 | 200 | 193.86µs | 192.168.51.134 | HEAD "/"
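Worth noting: the 500 on /api/generate arrives after exactly 1m0s, which looks like a 60-second timeout on the caller’s side rather than a crash inside Ollama. One way to test that theory (the endpoint and model below are my assumptions, not taken from the logs) is to time the same kind of request with a generous client timeout:

```shell
# If this completes successfully but takes more than 60 seconds, the
# model load has merely become slower after the upgrade, and raising the
# caller's (e.g. Frigate's) timeout may be enough.
URL="http://localhost:11434/api/generate"   # hypothetical endpoint
PAYLOAD='{"model": "llava", "prompt": "Describe the snapshot.", "stream": false}'
time curl -s --max-time 300 "$URL" -d "$PAYLOAD" || echo "request failed"
```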

I’m having the same issue. I’ve spent days trying to fix the install, but I might have to give up. There are so many problems with the Orin Nano, and being new to this doesn’t really help: out-of-date OS, security flaws… Wheels? Not too sure what those are, but I’m guessing some sort of pre-compiled binary.

I think I’m going to have to format and start again. Initially I was told the system is so new that appropriate software wasn’t available yet. A few months later and now it seems to be out of date and is unsupported.

Can I install JetPack 7 on the Orin Nano, or is it now out of date and unsupported? Maybe I should just chuck it away and not deal with NVIDIA until I’m able to spend $3-4k.

No, according to the roadmap, JetPack 6.3 will be released next year. JetPack 7 is not supported on the Orin Nano.
I only got my Orin Nano last week because I thought the software would be good, but there are still quite a few issues — especially with the new JetPack 6.2.1.
Something seems to be wrong with the RAM allocation; NVIDIA really needs to fix that.
At the moment, I’m not particularly happy about it, especially considering the price isn’t exactly cheap.
Hopefully, an update will be released sooner than expected.

Exactly, and with only 8 GB to play with, any memory allocation problems are serious.

I thought I might abandon Ollama (or at least keep it as a backup) and go with TensorRT, which offers 1.5x better performance. But I was having problems installing the libraries, because it seems the NVIDIA package download is blocked in the UK (either deliberately or by mistake).

I could sign up to their website but couldn’t confirm my email address unless I used the Tor browser, which is utterly ridiculous. I even went as far as installing OpenVPN to bypass the issue.
NVIDIA may know hardware, but they sure are clueless when it comes to networking and software support. Either that, or they know and just don’t care…

There is their Discord, but that isn’t for support - yet another ridiculous decision. I think they need to get their sh1t together.

Another thing. With an NVIDIA GPU you can install any OS you want and then just install their drivers to get things up and running…

But with Jetson devices, NVIDIA forces you into their JetPack ecosystem, which is:

  • A monolithic BSP (Board Support Package) that bundles OS + kernel + drivers + CUDA + cuDNN + TensorRT
  • Based on Ubuntu (only the version they choose)
  • Requires their specific flashing tools and procedures
  • Makes it nearly impossible to use alternative distributions or keep your OS up-to-date independently

The technical reasons they cite (boot firmware, device tree configurations, kernel patches for Tegra SoC) are valid, but many in the community (myself included) argue NVIDIA could absolutely provide standalone drivers and make this more flexible - they just choose not to.

What the community wants:

  • Standard ARM64 Ubuntu/Debian/Fedora installation
  • apt install nvidia-jetson-drivers or similar
  • Separate CUDA/cuDNN packages you can update independently

What NVIDIA forces:

  • Download their 10GB+ JetPack SDK
  • Flash entire OS image with their tools
  • Wait months/years between updates
  • Can’t easily use newer Ubuntu versions or other distros

It’s a deliberate lock-in strategy that’s incredibly frustrating for anyone who wants flexibility in their infrastructure. Which we all do.

The Raspberry Pi ecosystem handles this so much better - standard Debian with additional repos for hardware-specific packages.

NVIDIA are literally responsible for holding up the development of AI. They are the problem, not the solution. This has got to stop.

So, I am also new to the Orin. Is the expectation to install JetPack only and then wait for the next JetPack release? I thought JetPack was only a starting point to get things loaded correctly, and that apt would handle updates going forward, since it is Ubuntu.

Hi, all

Thanks for your report.

When upgrading from r36.4.4 to r36.4.7, an as-yet-unidentified issue breaks Ollama functionality.
We are actively checking this issue internally.

Will keep you updated on the status.
Sorry for the inconvenience.
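In the meantime, one possible stopgap (my own workaround sketch, not an official recommendation) is to hold the L4T packages at a known-good version, so a routine apt upgrade cannot pull the broken r36.4.7 packages back in:

```shell
# List installed nvidia-l4t packages and put them on hold. The echo makes
# this a dry run; remove it to actually apply the hold.
if command -v dpkg >/dev/null 2>&1; then
  dpkg -l | awk '/^ii  nvidia-l4t/ {print $2}' | xargs -r echo sudo apt-mark hold
fi
# Undo later by piping the same package list into: sudo apt-mark unhold
```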

Hi, @muttleydosomething

JetPack 7 is only available for Thor now.
For Orin, please set up with JetPack 6.2.1.

TensorRT should be pre-installed with the SDK Manager by default.
If not, you can also install it via the command below:

$ sudo apt install nvidia-jetpack

Thanks
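After running that, a couple of quick checks can confirm TensorRT actually landed (the package and module names here are typical for JetPack 6.x, so treat them as assumptions):

```shell
# Look for installed TensorRT packages.
dpkg -l 2>/dev/null | grep -i tensorrt || echo "no TensorRT packages found"

# Check the Python bindings, if you plan to use them.
python3 -c "import tensorrt; print(tensorrt.__version__)" 2>/dev/null \
  || echo "python tensorrt bindings not available"
```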

Just to clarify, reverting to dustynv/ollama:main-r36.4.0 did not fix my issue. I was successfully running dustynv/ollama:r36.4-cu129-24.04 for a while, until I did an apt update / apt upgrade and then something broke. I can still run things perfectly from Open WebUI, installed through Docker on the same host. It seems as if external connections do not work.
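Since local requests work but external ones don’t, it may be worth checking what the server is actually bound to and whether the Docker port mapping survived the upgrade. A debugging sketch (the container name "ollama" and port 11434 are assumptions; substitute your own):

```shell
# Is anything listening on the Ollama port on all interfaces?
# Expect 0.0.0.0:11434 (or [::]:11434), not 127.0.0.1:11434.
if command -v ss >/dev/null 2>&1; then
  ss -tln | grep 11434 || echo "nothing listening on 11434"
fi

# Did the container's port mapping survive the Docker upgrade?
docker port ollama 2>/dev/null || echo "container 'ollama' not found or docker unavailable"

# From another machine on the LAN:
#   curl -s http://<jetson-ip>:11434/api/version
```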

@AastaLLL Should I move forward here?

I installed the JetPack 6.2.1 sources onto an NVMe drive using the SDK Manager.
Ollama runs natively. I can start llama3.2:4b, but only after clearing the RAM cache.