Ollama on Docker does not find GPU

I have a fresh install of the latest JetPack on a new Jetson Orin Nano Dev Kit Super. Installed it via the desktop flash method yesterday (02/27/2025).

When running Ollama via Docker, the models, even smaller ones, run entirely on the CPU. I get the following messages in the Ollama logs:

time=2025-02-28T14:05:25.428Z level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[/usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1]
initializing /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1
library /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1 load err: /usr/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvrm_gpu.so)
time=2025-02-28T14:05:25.428Z level=INFO source=gpu.go:612 msg="Unable to load cudart library /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1: Unable to load /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1 library to query for Nvidia GPUs: /usr/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvrm_gpu.so)"

time=2025-02-28T14:05:25.430Z level=DEBUG source=gpu.go:574 msg="Unable to load cudart library /usr/lib/ollama/cuda_v11/libcudart.so.11.3.109: your nvidia driver is too old or missing. If you have a CUDA GPU please upgrade to run ollama"

My docker-compose.yaml is fairly simple, following the basic example:

services:
  openWebUI:
    container_name: open-webui
    # image: ghcr.io/open-webui/open-webui:main @TODO: Fix this once a PR solves the issue.
    image: ghcr.io/open-webui/open-webui:v0.5.16
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - /open-webui:/app/backend/data
    depends_on:
      - ollama
    environment:
      - WEBUI_AUTH=False
      - OLLAMA_BASE_URL=http://10.32.1.229:11434
      - GLOBAL_LOG_LEVEL=DEBUG 

  ollama:
    container_name: ollama
    image: ollama/ollama:latest
    # image: dustynv/ollama:main-r36.4.0
    runtime: nvidia
    pull_policy: always
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - /ollama:/root/.ollama
    environment:
      - OLLAMA_KEEP_ALIVE=24h
      - OLLAMA_HOST=0.0.0.0:11434
      - OLLAMA_DEBUG=1
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

According to ldd, the host is running GLIBC 2.35, so I would assume I meet the requirements.

ldd (Ubuntu GLIBC 2.35-0ubuntu3.8) 2.35
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

As far as I know, everything is current, aside from Docker which I am purposely holding back to 27.x because of issues with 28.x.

Any help would be greatly appreciated!

Hi @nvidia3408 ,

The official Ollama installer checks for the existence of the /etc/nv_tegra_release file.

You can make the file available to the container like this when bundling the container image.

Here are the instructions for using the pre-built container.
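For example, with a docker-compose setup like the one above, the file can be bind-mounted into the ollama service (a sketch based on the volumes that jetson-containers passes to docker run; the service definition mirrors the compose file from the original post):

```yaml
  ollama:
    image: ollama/ollama:latest
    runtime: nvidia
    volumes:
      - /ollama:/root/.ollama
      # Expose the JetPack release file so the container can detect
      # that it is running on a Jetson (Tegra) device
      - /etc/nv_tegra_release:/etc/nv_tegra_release:ro
```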

Regards,

I’ve messed with this all weekend, with no success.

When I try to run Ollama with jetson-containers, I still have an issue with the CUDA libraries: no GPU is found, and the Ollama models run entirely on the CPU.


dave@ai:~/jetson-containers/packages/llm/ollama $ jetson-containers run ollama/ollama:latest
V4L2_DEVICES:
+ docker run --runtime nvidia -it --rm --network host --shm-size=8g --volume /tmp/argus_socket:/tmp/argus_socket --volume /etc/enctune.conf:/etc/enctune.conf --volume /etc/nv_tegra_release:/etc/nv_tegra_release --volume /tmp/nv_jetson_model:/tmp/nv_jetson_model --volume /var/run/dbus:/var/run/dbus --volume /var/run/avahi-daemon/socket:/var/run/avahi-daemon/socket --volume /var/run/docker.sock:/var/run/docker.sock --volume /home/dave/jetson-containers/data:/data -v /etc/localtime:/etc/localtime:ro -v /etc/timezone:/etc/timezone:ro --device /dev/snd -e PULSE_SERVER=unix:/run/user/1000/pulse/native -v /run/user/1000/pulse:/run/user/1000/pulse --device /dev/bus/usb --device /dev/i2c-0 --device /dev/i2c-1 --device /dev/i2c-2 --device /dev/i2c-4 --device /dev/i2c-5 --device /dev/i2c-7 --name jetson_container_20250303_180703 ollama/ollama:latest
Couldn't find '/root/.ollama/id_ed25519'. Generating new private key.
Your new public key is:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJ/Sp3PDhLR/nXjeUyL6u5sPV8erMczdFhQK4/Xn95CO

2025/03/03 18:07:04 routes.go:1205: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-03-03T18:07:04.197-05:00 level=INFO source=images.go:432 msg="total blobs: 0"
time=2025-03-03T18:07:04.197-05:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-03T18:07:04.198-05:00 level=INFO source=routes.go:1256 msg="Listening on 0.0.0.0:11434 (version 0.5.12)"
time=2025-03-03T18:07:04.199-05:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-03T18:07:04.202-05:00 level=INFO source=gpu.go:612 msg="Unable to load cudart library /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1: Unable to load /usr/lib/aarch64-linux-gnu/nvidia/libcuda.so.1.1 library to query for Nvidia GPUs: /usr/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /usr/lib/aarch64-linux-gnu/nvidia/libnvrm_gpu.so)"
time=2025-03-03T18:07:04.207-05:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-03-03T18:07:04.207-05:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="7.4 GiB" available="6.0 GiB"

This works if I run Ollama on bare metal, but I would much prefer to run it in a container.

It appears this is an issue with Ollama.

I ran Ollama 0.5.12 (the latest at the time) and worked backwards through the versions until I found that 0.5.7 worked as expected. Everything 0.5.8 and newer reports:

ollama      | time=2025-03-04T20:48:09.206Z level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[/usr/lib/ollama/cuda_v11/libcudart.so.11.3.109 /usr/lib/ollama/cuda_v12/libcudart.so.12.4.127]"
ollama      | cudaSetDevice err: 35
ollama      | time=2025-03-04T20:48:09.209Z level=DEBUG source=gpu.go:574 msg="Unable to load cudart library /usr/lib/ollama/cuda_v11/libcudart.so.11.3.109: your nvidia driver is too old or missing.  If you have a CUDA GPU please upgrade to run ollama"
ollama      | cudaSetDevice err: 35
ollama      | time=2025-03-04T20:48:09.210Z level=DEBUG source=gpu.go:574 msg="Unable to load cudart library /usr/lib/ollama/cuda_v12/libcudart.so.12.4.127: your nvidia driver is too old or missing.  If you have a CUDA GPU please upgrade to run ollama"
ollama      | time=2025-03-04T20:48:09.210Z level=DEBUG source=amd_linux.go:419 msg="amdgpu driver not detected /sys/module/amdgpu"
ollama      | time=2025-03-04T20:48:09.210Z level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
ollama      | time=2025-03-04T20:48:09.211Z level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="7.4 GiB" available="6.5 GiB"
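Until the regression is fixed upstream, one workaround consistent with the finding above is to pin the image to the last known-good release in docker-compose.yaml (a sketch; this assumes the ollama/ollama:0.5.7 tag is published on Docker Hub, and pull_policy: always must be dropped or the pin kept in place so a newer image is not pulled over it):

```yaml
  ollama:
    container_name: ollama
    # Pin to 0.5.7, the last version that detected the Jetson GPU
    # in this setup; 0.5.8+ fails CUDA discovery as shown above
    image: ollama/ollama:0.5.7
    runtime: nvidia
```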

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.