Updating Orin Nano breaks Ollama

First, my apologies if I didn’t pick the right area to post. I run an Ollama Docker container that I use for both Home Assistant and Frigate. I recently updated my Orin Nano using apt, and now I am getting all kinds of out-of-memory errors.

Here is what was updated:

(nvidia-l4t-weston:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), containerd.io:arm64 (1.7.28-0~ubuntu.22.04~jammy, 1.7.28-1~ubuntu.22.04~jammy), docker-compose-plugin:arm64 (2.39.4-0~ubuntu.22.04~jammy, 2.40.0-1~ubuntu.22.04~jammy), nvidia-l4t-vulkan-sc-samples:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), docker-ce-cli:arm64 (5:28.5.0-1~ubuntu.22.04~jammy, 5:28.5.1-1~ubuntu.22.04~jammy), nvidia-l4t-vulkan-sc-sdk:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-firmware:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), wpasupplicant:arm64 (2:2.10-6ubuntu2.2, 2:2.10-6ubuntu2.3), nvidia-l4t-kernel-oot-headers:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-oem-config:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-jetson-multimedia-api:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-libwayland-egl1:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-wayland:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-kernel-oot-modules:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-kernel:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-graphics-demos:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), gir1.2-javascriptcoregtk-4.0:arm64 (2.48.5-0ubuntu0.22.04.1, 2.48.7-0ubuntu0.22.04.2), nvidia-l4t-3d-core:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvpmodel:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-cuda-utils:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), gir1.2-webkit2-4.0:arm64 (2.48.5-0ubuntu0.22.04.1, 2.48.7-0ubuntu0.22.04.2), libwbclient0:arm64 (2:4.15.13+dfsg-0ubuntu1.8, 2:4.15.13+dfsg-0ubuntu1.9), nvidia-l4t-tools:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), libsmbclient:arm64 (2:4.15.13+dfsg-0ubuntu1.8, 2:4.15.13+dfsg-0ubuntu1.9), nvidia-l4t-core:arm64 
(36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-kernel-dtbs:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-optee:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-cuda:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), poppler-utils:arm64 (22.02.0-2ubuntu0.10, 22.02.0-2ubuntu0.11), nvidia-l4t-dla-compiler:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvml:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), docker-ce:arm64 (5:28.5.0-1~ubuntu.22.04~jammy, 5:28.5.1-1~ubuntu.22.04~jammy), nvidia-l4t-openwfd:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvfancontrol:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-libwayland-client0:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvsci:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-init:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-gbm:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-jetsonpower-gui-tools:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-vulkan-sc:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-display-kernel:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), libpoppler-glib8:arm64 (22.02.0-2ubuntu0.10, 22.02.0-2ubuntu0.11), nvidia-l4t-configs:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-pva:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-multimedia:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-multimedia-utils:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-x11:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), docker-ce-rootless-extras:arm64 (5:28.5.0-1~ubuntu.22.04~jammy, 5:28.5.1-1~ubuntu.22.04~jammy), nvidia-l4t-apt-source:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), libpoppler118:arm64 (22.02.0-2ubuntu0.10, 22.02.0-2ubuntu0.11), 
nvidia-l4t-kernel-headers:arm64 (5.15.148-tegra-36.4.4-20250616085344, 5.15.148-tegra-36.4.7-20250918154033), nvidia-l4t-bootloader:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), snapd:arm64 (2.68.5+ubuntu22.04.1, 2.71+ubuntu22.04), nvidia-l4t-gstreamer:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), libjavascriptcoregtk-4.0-18:arm64 (2.48.5-0ubuntu0.22.04.1, 2.48.7-0ubuntu0.22.04.2), nvidia-l4t-camera:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-libwayland-cursor0:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-nvpmodel-gui-tools:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), samba-libs:arm64 (2:4.15.13+dfsg-0ubuntu1.8, 2:4.15.13+dfsg-0ubuntu1.9), nvidia-l4t-initrd:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), libwebkit2gtk-4.0-37:arm64 (2.48.5-0ubuntu0.22.04.1, 2.48.7-0ubuntu0.22.04.2), nvidia-l4t-libwayland-server0:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-xusb-firmware:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-jetson-io:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033), nvidia-l4t-vulkan-sc-dev:arm64 (36.4.4-20250616085344, 36.4.7-20250918154033)

error from Open WebUI

500: llama runner process has terminated: cudaMalloc failed: out of memory ggml_gallocr_reserve_n: failed to allocate CUDA0 buffer of size 315359232

@pdobrien3 I think you chose the wrong category for this issue; please select an appropriate one.

Hello,

Thanks for visiting the NVIDIA Developer Forums.
To ensure better visibility and support, I’ve moved your post to the Jetson category, where it’s more appropriate.

Cheers,
Tom

Hi,

Which container do you use?
Could you try our Ollama image to see if it works?

https://hub.docker.com/r/dustynv/ollama/tags

Thanks.

That is the image I use.

image: dustynv/ollama:0.6.8-r36.4-cu126-22.04

I think I have made some progress by adding the following environment variables:

GGML_CUDA_ENABLE_UNIFIED_MEMORY=1

OLLAMA_GPU_OVERHEAD=16032385536

I can get queries to work without errors from Open WebUI, but Frigate now times out. This is my environment variables section:

  - NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics
  - PULSE_SERVER=unix:/run/user/1000/pulse/native
  - OLLAMA_HOST=0.0.0.0
  - OLLAMA_NUM_PARALLEL=1 
  - OLLAMA_MAX_QUEUE=256
  - OLLAMA_MAX_LOADED_MODELS=2
  - OLLAMA_FLASH_ATTENTION=1 
  - OLLAMA_LOAD_TIMEOUT=10m
  - OLLAMA_KEEP_ALIVE=1440m
  - GGML_CUDA_ENABLE_UNIFIED_MEMORY=1
  - OLLAMA_GPU_OVERHEAD=16032385536
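One thing worth double-checking before treating these values as correct (a quick sanity check of my own, based on my understanding that OLLAMA_GPU_OVERHEAD is specified in bytes of memory to reserve):

```shell
# Convert the OLLAMA_GPU_OVERHEAD value above to GiB. 16032385536 bytes
# is roughly 14 GiB, which is more than the 8 GB of shared CPU/GPU
# memory on the Orin Nano, so this setting looks suspicious.
overhead_bytes=16032385536
gib=$(( overhead_bytes / 1073741824 ))   # 1 GiB = 1073741824 bytes
echo "OLLAMA_GPU_OVERHEAD ~= ${gib} GiB"
```

If the value was meant as a workaround, something closer to a few hundred MB would seem more plausible on an 8 GB board.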

Can anyone confirm whether these environment variables are appropriate for the Jetson Orin Nano?

Hi,

For example, we can launch Gemma 3 4B with the command below:

docker run -it --rm \
  -e OLLAMA_MODEL=gemma3:4b \
  -e OLLAMA_MODELS=/root/.ollama \
  -e OLLAMA_HOST=0.0.0.0:9000 \
  -e OLLAMA_CONTEXT_LEN=4096 \
  -e OLLAMA_LOGS=/root/.ollama/ollama.log \
  -v /mnt/nvme/cache/ollama:/root/.ollama \
  --gpus all \
  -p 9000:9000 \
  -e DOCKER_PULL=always --pull always \
  -e HF_TOKEN=${HF_TOKEN} \
  -e HF_HUB_CACHE=/root/.cache/huggingface \
  -v /mnt/nvme/cache:/root/.cache \
  dustynv/ollama:main-r36.4.0

It works without adding the GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 setting.
But please make sure to add --gpus all to allow GPU access within the container.

Thanks.
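For what it’s worth, here is a quick smoke test against the container started by the command above; the host, port, and model name mirror that example and may need adjusting for your setup:

```shell
# Probe the Ollama API exposed on port 9000 by the docker run example.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:9000}"

# Server version; returns a small JSON object when the server is up.
curl -s "$OLLAMA_URL/api/version" || echo "server not reachable"

# One-off generation to confirm the model really loads.
curl -s "$OLLAMA_URL/api/generate" \
  -d '{"model": "gemma3:4b", "prompt": "Say hi", "stream": false}' \
  || echo "generate request failed"
```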

Appreciate the reply; unfortunately, adding that did not help the situation. I am stumped: Ollama works great through Open WebUI, better than ever, but anything else that tries to use the Ollama container now times out, when it didn’t before I upgraded.

These are the only logs around a failed attempt to analyze a picture/snapshot:

time=2025-10-24T07:24:05.967-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.alignment default=32
time=2025-10-24T07:24:05.975-04:00 level=WARN source=routes.go:281 msg="the context field is deprecated and will be removed in a future version of Ollama"
[GIN] 2025/10/24 - 07:25:05 | 500 | 1m0s | 192.168.51.140 | POST "/api/generate"
[GIN] 2025/10/24 - 07:28:57 | 200 | 193.86µs | 192.168.51.134 | HEAD "/"
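Worth noting: the 500 on /api/generate arrives after exactly 1m0s, which looks like a 60-second timeout on the caller’s side rather than a crash inside Ollama. One way to test that theory (the endpoint and model below are my assumptions, not taken from the logs) is to time the same kind of request with a generous client timeout:

```shell
# If this completes successfully but takes more than 60 seconds, the
# model load has merely become slower after the upgrade, and raising the
# caller's (e.g. Frigate's) timeout may be enough.
URL="http://localhost:11434/api/generate"   # hypothetical endpoint
PAYLOAD='{"model": "llava", "prompt": "Describe the snapshot.", "stream": false}'
time curl -s --max-time 300 "$URL" -d "$PAYLOAD" || echo "request failed"
```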

I’m having the same issue. I’ve spent days trying to fix the install, but I might have to give up. There are so many problems with the Orin Nano, and being new to this doesn’t really help: out-of-date OS, security flaws… Wheels? Not too sure what those are, but I’m guessing some sort of pre-compiled binary.

I think I’m going to have to format and start again. Initially I was told the system is so new that appropriate software wasn’t available yet. A few months later and now it seems to be out of date and is unsupported.

Can I install JetPack 7 on the Orin Nano, or is it now out of date and unsupported? Maybe I should just chuck it away and not deal with NVIDIA until I’m able to spend $3-4k.

No, according to the roadmap, JetPack 6.3 will be released next year. JetPack 7 is not supported on the Orin Nano.
I only got my Orin Nano last week because I thought the software would be good, but there are still quite a few issues — especially with the new JetPack 6.2.1.
Something seems to be wrong with the RAM allocation; NVIDIA really needs to fix that.
At the moment, I’m not particularly happy about it, especially considering the price isn’t exactly cheap.
Hopefully, an update will be released sooner than expected.

Exactly, and with only 8 GB to play with, any memory allocation problems are serious.

I thought I might abandon Ollama (or at least keep it as a backup) and go with TensorRT, which offers 1.5x better performance. But I was having problems installing the libraries, because it seems the NVIDIA package download is blocked in the UK (either deliberately or by mistake).

I could sign up to their website but couldn’t confirm my email address unless I used the Tor browser, which is utterly ridiculous. I even went as far as installing OpenVPN to bypass the issue.
NVIDIA may know hardware, but they sure are clueless when it comes to networking and software support. Either that, or they know and just don’t care…

There is their Discord, but that isn’t for support - yet another ridiculous decision. I think they need to get their sh1t together.

Another thing. With an NVIDIA GPU you can install any OS you want and then just install their drivers to get things up and running…

But with Jetson devices, NVIDIA forces you into their JetPack ecosystem, which is:

  • A monolithic BSP (Board Support Package) that bundles OS + kernel + drivers + CUDA + cuDNN + TensorRT
  • Based on Ubuntu (only the version they choose)
  • Requires their specific flashing tools and procedures
  • Makes it nearly impossible to use alternative distributions or keep your OS up-to-date independently

The technical reasons they cite (boot firmware, device tree configurations, kernel patches for Tegra SoC) are valid, but many in the community (myself included) argue NVIDIA could absolutely provide standalone drivers and make this more flexible - they just choose not to.

What the community wants:

  • Standard ARM64 Ubuntu/Debian/Fedora installation
  • apt install nvidia-jetson-drivers or similar
  • Separate CUDA/cuDNN packages you can update independently

What NVIDIA forces:

  • Download their 10GB+ JetPack SDK
  • Flash entire OS image with their tools
  • Wait months/years between updates
  • Can’t easily use newer Ubuntu versions or other distros

It’s a deliberate lock-in strategy that’s incredibly frustrating for anyone who wants flexibility in their infrastructure. Which we all do.

The Raspberry Pi ecosystem handles this so much better - standard Debian with additional repos for hardware-specific packages.

NVIDIA are literally responsible for holding up the development of AI. They are the problem, not the solution. This has got to stop.

So, I am also new to the Orin. Is the expectation to install JetPack only and then wait for the next JetPack release? I thought JetPack was only a starting point to get things loaded correctly, and that apt would handle updates going forward, since it is Ubuntu.

Hi, all

Thanks for your report.

When upgrading from r36.4.4 to r36.4.7, an as-yet-unidentified issue breaks Ollama functionality.
We are actively checking this issue internally.

Will keep you updated on the status.
Sorry for the inconvenience.
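In the meantime, one possible stopgap (my own workaround sketch, not an official recommendation) is to hold the L4T packages at a known-good version, so a routine apt upgrade cannot pull the broken r36.4.7 packages back in:

```shell
# List installed nvidia-l4t packages and put them on hold. The echo makes
# this a dry run; remove it to actually apply the hold.
if command -v dpkg >/dev/null 2>&1; then
  dpkg -l | awk '/^ii  nvidia-l4t/ {print $2}' | xargs -r echo sudo apt-mark hold
fi
# Undo later by piping the same package list into: sudo apt-mark unhold
```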

Hi, @muttleydosomething

JetPack 7 is only available for Thor now.
For Orin, please set up with JetPack 6.2.1.

TensorRT should be pre-installed with the SDK Manager by default.
If not, you can also install it via the command below:

$ sudo apt install nvidia-jetpack

Thanks
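After running that, a couple of quick checks can confirm TensorRT actually landed (the package and module names here are typical for JetPack 6.x, so treat them as assumptions):

```shell
# Look for installed TensorRT packages.
dpkg -l 2>/dev/null | grep -i tensorrt || echo "no TensorRT packages found"

# Check the Python bindings, if you plan to use them.
python3 -c "import tensorrt; print(tensorrt.__version__)" 2>/dev/null \
  || echo "python tensorrt bindings not available"
```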

Just to clarify, reverting to dustynv/ollama:main-r36.4.0 did not fix my issue. I was successfully running dustynv/ollama:r36.4-cu129-24.04 for a while, until I did an apt update / apt upgrade and then something broke. I can still run things perfectly from Open WebUI, installed through Docker on the same host. It seems as if external connections do not work.
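Since local requests work but external ones don’t, it may be worth checking what the server is actually bound to and whether the Docker port mapping survived the upgrade. A debugging sketch (the container name "ollama" and port 11434 are assumptions; substitute your own):

```shell
# Is anything listening on the Ollama port on all interfaces?
# Expect 0.0.0.0:11434 (or [::]:11434), not 127.0.0.1:11434.
if command -v ss >/dev/null 2>&1; then
  ss -tln | grep 11434 || echo "nothing listening on 11434"
fi

# Did the container's port mapping survive the Docker upgrade?
docker port ollama 2>/dev/null || echo "container 'ollama' not found or docker unavailable"

# From another machine on the LAN:
#   curl -s http://<jetson-ip>:11434/api/version
```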

@AastaLLL Should I move forward here?

I installed the JetPack 6.2.1 sources onto an NVMe drive using the SDK Manager.
Ollama runs natively. I can start llama3.2:4b, but only after clearing the RAM cache.