PyTorch container 'pytorch:19.10-py3' fails to load CUDA: "This container was built for NVIDIA Driver Release 418.87 or later, but version 410.66 was detected and compatibility mode is UNAVAILABLE"

I have just pulled the latest PyTorch container ‘pytorch:19.10-py3’. However, when I connect to the container I get this error:

== PyTorch ==

NVIDIA Release 19.10 (build 8472689)
PyTorch Version 1.3.0a0+24ae9b5

Container image Copyright (c) 2019, NVIDIA CORPORATION.  All rights reserved.

Copyright (c) 2014-2019 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies    (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU                      (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006      Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015      Google Inc.
Copyright (c) 2015      Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.

ERROR: This container was built for NVIDIA Driver Release 418.87 or later, but
       version 410.66 was detected and compatibility mode is UNAVAILABLE.

       [[CUDA Driver UNAVAILABLE (cuInit(0) returned 804)]]

NOTE: MOFED driver for multi-node communication was not detected.
      Multi-node communication performance may be reduced.
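The startup check that prints this error is, at its core, a numeric comparison of the detected host driver version against the minimum the container was built for (410.66 vs. 418.87 here). A minimal sketch of that idea in Python (a hypothetical helper, not the container's actual entrypoint code):

```python
def meets_minimum(detected: str, required: str) -> bool:
    """Compare dotted driver versions numerically, e.g. '410.66' vs '418.87'.

    String comparison would be wrong ('9' > '10'), so each version is
    split into integer components and compared as a tuple.
    """
    to_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return to_tuple(detected) >= to_tuple(required)

print(meets_minimum("410.66", "418.87"))  # False: the host driver fails the check
print(meets_minimum("430.26", "418.87"))  # True: a newer driver passes
```

This is why the error appears even though `nvcc` inside the container reports CUDA 10.1: the CUDA *toolkit* ships with the image, but the *driver* comes from the host.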

nvcc --version shows this:

root@b7275b4dd0fd:/workspace# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

and nvidia-smi shows this:

root@b7275b4dd0fd:/workspace# nvidia-smi
Wed Oct 30 20:50:18 2019
| NVIDIA-SMI 410.66       Driver Version: 410.66       CUDA Version: 10.0     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  GeForce GTX 108...  Off  | 00000000:0A:00.0 Off |                  N/A |
| 25%   45C    P8    18W / 250W |      0MiB / 11177MiB |      0%      Default |
|   1  GeForce GTX 108...  Off  | 00000000:0B:00.0 Off |                  N/A |
| 23%   29C    P8    10W / 250W |      0MiB / 11178MiB |      0%      Default |

| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No running processes found                                                 |

Is it possible that the container was built with the wrong driver version?

I have tried restarting the container, and even the host, but the error remains.


Apologies: after posting this I upgraded the host's driver, which appears to fix the problem. It seems the container uses the host's driver rather than having its own, which makes sense.

For anyone else stumbling across this, just follow these steps to upgrade host drivers:
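On an Ubuntu host, the upgrade can look roughly like the following (a sketch, assuming an apt-based system with the graphics-drivers PPA; package names and branch numbers vary by distro, and any driver branch at or above 418.87 satisfies this container):

```shell
# Assumption: Ubuntu host. Add the graphics-drivers PPA for newer driver packages.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update

# Install a driver branch new enough for the container (>= 418.87 here).
sudo apt-get install nvidia-driver-418

# Reboot so the new kernel module loads.
sudo reboot

# After reboot, verify the host driver the container will see:
nvidia-smi
```

The `Driver Version` field in the `nvidia-smi` header should now read 418.87 or later, and the container's startup check will pass.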