CUDA sometimes not available in nvcr.io/nvidia/pytorch:25.09-py3

jld · November 5, 2025, 8:50pm

When running inside a Docker container I sometimes (not consistently) get:

W1105 06:16:44.530000 75 torch/utils/cpp_extension.py:118] No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'

Given that this container is specifically designed to provide CUDA and PyTorch, I find this surprising. It seems to happen intermittently, possibly only on one of the machines in the pool. In any case, nvidia-smi inside the container reports a healthy machine, with GPUs and CUDA.

Here is the full error output:

+ python benchmarks/benchmark_attn.py

/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
  import pynvml  # type: ignore[import]

/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:182: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /opt/pytorch/pytorch/c10/cuda/CUDAFunctions.cpp:109.)
  return torch._C._cuda_getDeviceCount() > 0

W1105 06:16:57.201000 284 torch/utils/cpp_extension.py:118] No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'

Traceback (most recent call last):
  File "/tmp/workspace/fa4/benchmarks/benchmark_attn.py", line 35, in <module>
    if torch.cuda.get_device_capability()[0] != 9:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py", line 598, in get_device_capability
    prop = get_device_properties(device)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py", line 614, in get_device_properties
    _lazy_init()  # will define _get_device_properties
    ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py", line 410, in _lazy_init
    torch._C._cuda_init()

RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.
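
Since the failure is intermittent and may be tied to one machine, it could help to probe CUDA in a fresh child process before starting the benchmark and log which host the probe ran on. Below is a rough sketch of that idea, not anything shipped with the container. The function name `probe_cuda` and the default probe command are my own assumptions; the only PyTorch call involved is `torch.cuda.is_available()`. Note that the hostname seen inside a container is usually the container ID, so the machine name may need to be recorded separately.

```python
import socket
import subprocess
import sys

# Hypothetical default probe: ask a fresh interpreter whether PyTorch sees CUDA.
DEFAULT_PROBE = [
    sys.executable,
    "-c",
    "import torch; print(torch.cuda.is_available())",
]


def probe_cuda(cmd=None, timeout=120):
    """Run the probe command in a child process and summarize the result.

    A child process is used so that a failed CUDA initialization does not
    affect the process that will later run the real workload.
    """
    cmd = DEFAULT_PROBE if cmd is None else cmd
    host = socket.gethostname()
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return {"host": host, "ok": False, "detail": "probe timed out"}

    # The probe prints True/False as its last stdout line; warnings go to stderr.
    lines = result.stdout.strip().splitlines()
    ok = result.returncode == 0 and bool(lines) and lines[-1].strip() == "True"
    # Keep only the tail of stderr so the log entry stays short.
    return {"host": host, "ok": ok, "detail": result.stderr.strip()[-500:]}


if __name__ == "__main__":
    print(probe_cuda())
```

Logging this dictionary at the start of every job would show whether the failures cluster on a single host.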

zuckymiller · November 5, 2025, 11:52pm

Hey mate, I have run into that same CUDA error in Docker before while working on the Hypic Project; it's usually an environment setup issue. Try restarting the container with --gpus all and make sure the NVIDIA runtime is properly enabled. Also check that CUDA_HOME and LD_LIBRARY_PATH are set correctly. A clean rebuild of the image often clears the intermittent "No CUDA runtime is found" warning.
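
To make the environment check concrete, here is a minimal sketch that dumps the variables mentioned above, plus a few others the NVIDIA container runtime uses, and lists the GPU device nodes visible in the container. The helper name `cuda_env_report` and the choice of variables are assumptions for illustration; it uses only the Python standard library and does not touch PyTorch.

```python
import glob
import os

# Variables that commonly influence whether CUDA is visible in a container.
CUDA_ENV_VARS = (
    "CUDA_HOME",
    "LD_LIBRARY_PATH",
    "CUDA_VISIBLE_DEVICES",
    "NVIDIA_VISIBLE_DEVICES",
    "NVIDIA_DRIVER_CAPABILITIES",
)


def cuda_env_report(env=None, device_glob="/dev/nvidia*"):
    """Return the CUDA-related environment and the visible GPU device nodes."""
    env = os.environ if env is None else env
    # Unset variables are reported as None rather than omitted.
    report = {name: env.get(name) for name in CUDA_ENV_VARS}
    # Device nodes such as /dev/nvidia0 or /dev/nvidiactl, if any are present.
    report["device_nodes"] = sorted(glob.glob(device_glob))
    return report


if __name__ == "__main__":
    for key, value in cuda_env_report().items():
        print(f"{key} = {value!r}")
```

Comparing this output between a run that works and a run that fails should show whether the environment actually differs between them.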

jld · November 6, 2025, 5:03pm

This is an NVIDIA-supplied container: nvcr.io/nvidia/pytorch:25.09-py3
I would expect it to be set up properly already.

I am launching it like so:
$ docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --tty --detach --security-opt seccomp=unconfined --shm-size=4g -v /home/jld/pytorch-integration-testing/pytorch-integration-testing:/tmp/workspace -w /tmp/workspace nvcr.io/nvidia/pytorch:25.09-py3
