I have run out of ideas after repeatedly reinstalling nvidia-docker and the NVIDIA drivers. On an Ubuntu system I have the NVIDIA driver (460.27.04) and Docker installed, and I then installed nvidia-docker2 from the Ubuntu package. This normally works on my other machines. On this one, however, running a container with GPUs exposed fails with a ‘could not select device driver’ error. docker info does not list the nvidia runtime at all (only runc), even though the nvidia-container-runtime binary is installed.
Enabling ‘debug’ in the Docker daemon settings produced no additional information in the dockerd service log, and I could not find any other logs that hint at the cause of this error.
$ docker -D run --rm --gpus all nvidia/cuda:11.0-base
DEBU[0000] [hijack] End of stdout
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
$ docker info | grep run
Runtimes: runc
Default Runtime: runc
runc version:
WARNING: No swap limit support
$ dpkg -l | grep nvidia
ii libnvidia-container-tools 1.7.0-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.7.0-1 amd64 NVIDIA container runtime library
ii nvidia-container-toolkit 1.7.0-1 amd64 NVIDIA container runtime hook
ii nvidia-docker2 2.8.0-1 all nvidia-docker CLI wrapper
$ nvidia-smi
Tue Dec 21 10:05:50 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.27.04 Driver Version: 460.27.04 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
What could have gone wrong?
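For reference, here is a minimal sketch of the checks that seem relevant to the symptoms above. It is only a diagnostic aid under some assumptions: the helper name `has_nvidia_runtime` is made up for illustration, `/etc/docker/daemon.json` is assumed to be the config file this daemon reads (it is where nvidia-docker2 normally registers its runtime), and the plain `grep` is a rough stand-in for real JSON parsing.

```shell
#!/bin/sh
# Diagnostic sketch; the path and helper name are assumptions, not taken from the output above.

# Hypothetical helper: succeed if the given daemon config mentions an "nvidia" runtime.
# Returns 2 when the file cannot be read. A plain grep avoids depending on jq.
has_nvidia_runtime() {
    config="${1:-/etc/docker/daemon.json}"
    [ -r "$config" ] || return 2
    grep -q '"nvidia"' "$config"
}

# 1. Is the runtime registered in the config file the daemon is expected to read?
has_nvidia_runtime
case $? in
    0) echo "daemon.json: nvidia runtime entry found" ;;
    2) echo "daemon.json: file missing or unreadable" ;;
    *) echo "daemon.json: no nvidia runtime entry" ;;
esac

# 2. Are the NVIDIA container binaries visible on PATH?
for bin in nvidia-container-runtime nvidia-container-runtime-hook nvidia-container-cli; do
    if command -v "$bin" >/dev/null 2>&1; then
        echo "$bin: $(command -v "$bin")"
    else
        echo "$bin: not found on PATH"
    fi
done

# 3. What does the running daemon report? Skipped if the docker CLI is absent.
if command -v docker >/dev/null 2>&1; then
    docker info --format '{{json .Runtimes}}' 2>/dev/null || echo "docker info failed"
fi
```

If step 1 finds the entry but step 3 still shows only runc, my guess is that the daemon either has not been restarted since the package was installed or is reading a different configuration file than the one assumed here.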