TRT LLM for Inference - Import Error libcuda

magnus.lars.andersson · January 4, 2026, 8:33am

Hi,

When I try to run the attached docker compose file for TRT LLM, I get the following error. Any suggestions as to why? As far as I can tell, the file matches the one at TRT LLM for Inference | DGX Spark.

/magnus

docker-compose.txt (1.2 KB)

Fetching 18 files: 100%|██████████| 18/18 [09:41<00:00, 32.33s/it]
/root/.cache/huggingface/hub/models--openai--gpt-oss-20b/snapshots/6cee5e81ee83917806bbde320786a8fb61efebee
/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
  import pynvml # type: ignore[import]
Traceback (most recent call last):
  File "/usr/local/bin/trtllm-serve", line 3, in <module>
    from tensorrt_llm.commands.serve import main
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/__init__.py", line 70, in <module>
    import tensorrt_llm._torch.models as torch_models
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/__init__.py", line 1, in <module>
    from .llm import LLM
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/llm.py", line 1, in <module>
    from tensorrt_llm.llmapi.llm import _TorchLLM
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/__init__.py", line 1, in <module>
    from .._torch.async_llm import AsyncLLM
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/async_llm.py", line 3, in <module>
    from ..llmapi.llm import LLM
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/llmapi/llm.py", line 17, in <module>
    from tensorrt_llm._utils import mpi_disabled
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_utils.py", line 45, in <module>
    from tensorrt_llm.bindings import DataType, GptJsonConfig, LayerType
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

raphael.amorim · January 4, 2026, 2:58pm

First, check that the host sees the GPU:

nvidia-smi

Then check that a known-good CUDA image sees the GPU inside a container:

docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

If this fails, fix the NVIDIA Container Toolkit / Docker runtime configuration first.

Docker Compose v2 supports GPUs via gpus: / device requests. Docker’s docs show the supported patterns.

Here’s the minimal change (keep the rest of your service as-is):

services:
  trtllm_llm_server:
    image: ${DOCKER_IMAGE}
    network_mode: host
    ipc: host
    restart: unless-stopped

    # this is what you’re missing
    gpus: all

    environment:
      HF_TOKEN: ${HF_TOKEN}
      MODEL_HANDLE: ${MODEL_HANDLE}
      TIKTOKEN_ENCODINGS_BASE: ${TIKTOKEN_ENCODINGS_BASE}
      NVIDIA_DRIVER_CAPABILITIES: compute,utility   # optional but common
      NVIDIA_VISIBLE_DEVICES: all                   # optional but explicit

    volumes:
      - ${HOME}/.cache/huggingface:/root/.cache/huggingface

    command: >
      bash -lc '... your existing command ...'
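
To confirm the change took effect, you can test for the exact condition the traceback reports: whether the dynamic loader can open libcuda.so.1 from inside the container. The following is only a minimal sketch, not part of TensorRT-LLM. The helper name libcuda_loadable is made up for illustration, and it assumes the image's python3 is available (the traceback suggests Python 3.12 is installed).

```python
import ctypes


def libcuda_loadable(name: str = "libcuda.so.1") -> bool:
    """Return True if the dynamic loader can open the given shared library.

    Hypothetical helper for illustration; it only checks that the library
    can be opened, not that a GPU is actually usable.
    """
    try:
        ctypes.CDLL(name)
    except OSError:
        # Same failure mode as the ImportError above: the loader cannot
        # find or open the driver library in this environment.
        return False
    return True


if __name__ == "__main__":
    if libcuda_loadable():
        print("libcuda.so.1 can be loaded in this environment")
    else:
        print("libcuda.so.1 cannot be loaded; the container was likely started without GPU access")
```

If this reports that the library cannot be loaded inside the service container but the docker run check above succeeds, the compose service is most likely still being started without a GPU request.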

magnus.lars.andersson · January 5, 2026, 8:21am

Hi,

The missing gpus: all was the problem.

However, I noticed that running the above with openai/gpt-oss-20b uses more than 90 GB of memory.
Is this really right? I’m new to trt-llm and have not worked out all the settings yet, but this sounds like far more than expected.
