Ollama on Docker does not find GPU

I have a fresh install of the latest JetPack on a new Jetson Orin Nano Dev Kit Super. Installed it via the desktop flash method yesterday (02/27/2025).

When running Ollama via Docker, the models, even smaller ones, run entirely on the CPU. I get the following messages in the Ollama logs:

time=2025-02-28T14:05:25.428Z level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[/usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1]
initializing /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1
library /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1 load err: /usr/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvrm_gpu.so)
time=2025-02-28T14:05:25.428Z level=INFO source=gpu.go:612 msg="Unable to load cudart library /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1: Unable to load /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1 library to query for Nvidia GPUs: /usr/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvrm_gpu.so)"

time=2025-02-28T14:05:25.430Z level=DEBUG source=gpu.go:574 msg="Unable to load cudart library /usr/lib/ollama/cuda_v11/libcudart.so.11.3.109: your nvidia driver is too old or missing. If you have a CUDA GPU please upgrade to run ollama"

My docker-compose.yaml is fairly simple, following the basic example:

services:
  openWebUI:
    container_name: open-webui
    # image: ghcr.io/open-webui/open-webui:main @TODO: Fix this once a PR solves the issue.
    image: ghcr.io/open-webui/open-webui:v0.5.16
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - /open-webui:/app/backend/data
    depends_on:
      - ollama
    environment:
      - WEBUI_AUTH=False
      - OLLAMA_BASE_URL=http://10.32.1.229:11434
      - GLOBAL_LOG_LEVEL=DEBUG 

  ollama:
    container_name: ollama
    image: ollama/ollama:latest
    # image: dustynv/ollama:main-r36.4.0
    runtime: nvidia
    pull_policy: always
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - /ollama:/root/.ollama
    environment:
      - OLLAMA_KEEP_ALIVE=24h
      - OLLAMA_HOST=0.0.0.0:11434
      - OLLAMA_DEBUG=1
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

According to ldd, the host is running GLIBC 2.35, so I would assume I meet the requirements.

ldd (Ubuntu GLIBC 2.35-0ubuntu3.8) 2.35
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

As far as I know, everything is current, aside from Docker which I am purposely holding back to 27.x because of issues with 28.x.

Any help would be greatly appreciated!

Hi @nvidia3408 ,

The official Ollama installer checks for the existence of the /etc/nv_tegra_release file.

You can make the file available to the container like this when bundling the container image.

Here are the instructions for using the pre-built container.
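For example, with a docker-compose setup like the one above, the file can be bind-mounted into the ollama service (a sketch based on the volumes that jetson-containers passes to docker run; the service definition mirrors the compose file from the original post):

```yaml
  ollama:
    image: ollama/ollama:latest
    runtime: nvidia
    volumes:
      - /ollama:/root/.ollama
      # Expose the JetPack release file so the container can detect
      # that it is running on a Jetson (Tegra) device
      - /etc/nv_tegra_release:/etc/nv_tegra_release:ro
```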

Regards,

I’ve messed with this all weekend, with no success.

When I try to run Ollama with jetson-containers, I still have an issue with the CUDA libraries: no GPU is found, and the Ollama models run entirely on the CPU.


dave@ai:~/jetson-containers/packages/llm/ollama $ jetson-containers run ollama/ollama:latest
V4L2_DEVICES:
+ docker run --runtime nvidia -it --rm --network host --shm-size=8g --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/dave/jetson-containers/data:/data -v /etc/localtime:/etc/localtime:ro -v /etc/timezone:/etc/timezone:ro --device /dev/snd -e PULSE_SERVER=unix:/run/user/1000/pulse/native -v /run/user/1000/pulse:/run/user/1000/pulse --device /dev/bus/usb --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-7 --name jetson_container_20250303_180703 ollama/ollama:latest
Couldn't find '/root/.ollama/id_ed25519'. Generating new private key.
Your new public key is:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJ/Sp3PDhLR/nXjeUyL6u5sPV8erMczdFhQK4/Xn95CO

2025/03/03 18:07:04 routes.go:1205: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-03-03T18:07:04.197-05:00 level=INFO source=images.go:432 msg="total blobs: 0"
time=2025-03-03T18:07:04.197-05:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-03T18:07:04.198-05:00 level=INFO source=routes.go:1256 msg="Listening on 0.0.0.0:11434 (version 0.5.12)"
time=2025-03-03T18:07:04.199-05:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-03T18:07:04.202-05:00 level=INFO source=gpu.go:612 msg="Unable to load cudart library /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1: Unable to load /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1 library to query for Nvidia GPUs: /usr/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvrm_gpu.so)"
time=2025-03-03T18:07:04.207-05:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-03-03T18:07:04.207-05:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="7.4 GiB" available="6.0 GiB"

This works if I run Ollama on bare metal, but I would much prefer to run it in a container.

It appears this is an issue with Ollama.

I ran Ollama 0.5.12 (the latest at the time) and worked backwards through the versions until I found that 0.5.7 worked as expected. Everything 0.5.8 and newer reports:

ollama      | time=2025-03-04T20:48:09.206Z level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[/usr/lib/ollama/cuda_v11/libcudart.so.11.3.109 /usr/lib/ollama/cuda_v12/libcudart.so.12.4.127]"
ollama      | cudaSetDevice err: 35
ollama      | time=2025-03-04T20:48:09.209Z level=DEBUG source=gpu.go:574 msg="Unable to load cudart library /usr/lib/ollama/cuda_v11/libcudart.so.11.3.109: your nvidia driver is too old or missing.  If you have a CUDA GPU please upgrade to run ollama"
ollama      | cudaSetDevice err: 35
ollama      | time=2025-03-04T20:48:09.210Z level=DEBUG source=gpu.go:574 msg="Unable to load cudart library /usr/lib/ollama/cuda_v12/libcudart.so.12.4.127: your nvidia driver is too old or missing.  If you have a CUDA GPU please upgrade to run ollama"
ollama      | time=2025-03-04T20:48:09.210Z level=DEBUG source=amd_linux.go:419 msg="amdgpu driver not detected /sys/module/amdgpu"
ollama      | time=2025-03-04T20:48:09.210Z level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
ollama      | time=2025-03-04T20:48:09.211Z level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="7.4 GiB" available="6.5 GiB"
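Until the regression is fixed upstream, one workaround consistent with the finding above is to pin the image to the last known-good release in docker-compose.yaml (a sketch; this assumes the ollama/ollama:0.5.7 tag is published on Docker Hub, and pull_policy: always must be dropped or the pin kept in place so a newer image is not pulled over it):

```yaml
  ollama:
    container_name: ollama
    # Pin to 0.5.7, the last version that detected the Jetson GPU
    # in this setup; 0.5.8+ fails CUDA discovery as shown above
    image: ollama/ollama:0.5.7
    runtime: nvidia
```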

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.