We are trying to install nvidia-docker using the steps listed in the link below:
We have verified that CUDA 10.0 is installed correctly, as the terminal dump below shows.
Before running the command below, we ran the same command with CUDA 10.0 as the base image, i.e. docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi, and got the same error.
When we execute docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi, we get the following error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:344: starting container process caused "process_linux.go:424: container init caused "process_linux.go:407: running prestart hook 1 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=cuda>=9.0 --pid=6850 /var/lib/docker/overlay2/2cc9923c80fe972aeea84a85176845fc1cb2ad367161665faf5c4e00fbc4d966/merged]\\nnvidia-container-cli: initialization error: cuda error: no cuda-capable device is detected\\n\""": unknown.
How can we fix this error?
riaz@riaz-X705UDR:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
riaz@riaz-X705UDR:~$ docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
riaz@riaz-X705UDR:~$ sudo apt-get purge -y nvidia-docker
Reading package lists... Done
Building dependency tree
Reading state information... Done
Package 'nvidia-docker' is not installed, so not removed
The following packages were automatically installed and are no longer required:
libbsd0:i386 libdrm-amdgpu1:i386 libdrm-intel1:i386 libdrm-nouveau2:i386
libdrm-radeon1:i386 libdrm2:i386 libedit2:i386 libelf1:i386 libexpat1:i386
libffi6:i386 libgl1:i386 libgl1-mesa-dri:i386 libglapi-mesa:i386
libglvnd0:i386 libglx-mesa0:i386 libglx0:i386 libllvm6.0:i386
libnvidia-common-390 libpciaccess0:i386 libsensors4:i386 libstdc++6:i386
libwayland-client0:i386 libwayland-server0:i386 libx11-6:i386
libx11-xcb1:i386 libxau6:i386 libxcb-dri2-0:i386 libxcb-dri3-0:i386
libxcb-glx0:i386 libxcb-present0:i386 libxcb-sync1:i386 libxcb1:i386
libxdamage1:i386 libxdmcp6:i386 libxext6:i386 libxfixes3:i386
libxshmfence1:i386 libxxf86vm1:i386
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
riaz@riaz-X705UDR:~$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
sudo apt-key add -
OK
riaz@riaz-X705UDR:~$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
riaz@riaz-X705UDR:~$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list |
sudo tee /etc/apt/sources.list.d/nvidia-docker.list
deb https://nvidia.github.io/libnvidia-container/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/$(ARCH) /
riaz@riaz-X705UDR:~$ sudo apt-get update
Get:1 file:/var/cuda-repo-10-0-local-10.0.130-410.48 InRelease
Ign:1 file:/var/cuda-repo-10-0-local-10.0.130-410.48 InRelease
Get:2 file:/var/cuda-repo-10-0-local-10.0.130-410.48 Release [574 B]
Hit:3 Index of /deb stable InRelease
Get:2 file:/var/cuda-repo-10-0-local-10.0.130-410.48 Release [574 B]
Ign:4 http://dl.google.com/linux/chrome/deb stable InRelease
Hit:5 Index of linux/ubuntu/ bionic InRelease
Hit:6 Index of /ubuntu bionic InRelease
Get:7 Index of /ubuntu bionic-security InRelease [83.2 kB]
Hit:8 http://dl.google.com/linux/chrome/deb stable Release
Hit:9 https://nvidia.github.io/libnvidia-container/ubuntu18.04/amd64 InRelease
Get:11 Index of /ubuntu bionic-updates InRelease [88.7 kB]
Hit:12 https://nvidia.github.io/nvidia-container-runtime/ubuntu18.04/amd64 InRelease
Hit:13 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64 InRelease
Get:14 Index of /ubuntu bionic-backports InRelease [74.6 kB]
Fetched 247 kB in 2s (107 kB/s)
Reading package lists... Done
riaz@riaz-X705UDR:~$ sudo apt-get install -y nvidia-docker2
Reading package lists... Done
Building dependency tree
Reading state information... Done
nvidia-docker2 is already the newest version (2.0.3+docker18.09.1-1).
The following packages were automatically installed and are no longer required:
libbsd0:i386 libdrm-amdgpu1:i386 libdrm-intel1:i386 libdrm-nouveau2:i386
libdrm-radeon1:i386 libdrm2:i386 libedit2:i386 libelf1:i386 libexpat1:i386
libffi6:i386 libgl1:i386 libgl1-mesa-dri:i386 libglapi-mesa:i386
libglvnd0:i386 libglx-mesa0:i386 libglx0:i386 libllvm6.0:i386
libnvidia-common-390 libpciaccess0:i386 libsensors4:i386 libstdc++6:i386
libwayland-client0:i386 libwayland-server0:i386 libx11-6:i386
libx11-xcb1:i386 libxau6:i386 libxcb-dri2-0:i386 libxcb-dri3-0:i386
libxcb-glx0:i386 libxcb-present0:i386 libxcb-sync1:i386 libxcb1:i386
libxdamage1:i386 libxdmcp6:i386 libxext6:i386 libxfixes3:i386
libxshmfence1:i386 libxxf86vm1:i386
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
riaz@riaz-X705UDR:~$ sudo pkill -SIGHUP dockerd
riaz@riaz-X705UDR:~$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:344: starting container process caused "process_linux.go:424: container init caused "process_linux.go:407: running prestart hook 1 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=cuda>=9.0 --pid=6850 /var/lib/docker/overlay2/2cc9923c80fe972aeea84a85176845fc1cb2ad367161665faf5c4e00fbc4d966/merged]\\nnvidia-container-cli: initialization error: cuda error: no cuda-capable device is detected\\n\""": unknown.
riaz@riaz-X705UDR:~$
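
One thing the dump above does not cover: nvcc --version only confirms that the CUDA toolkit is installed. It says nothing about whether the NVIDIA kernel driver is loaded on the host, and nvidia-container-cli needs that driver. A minimal sketch of a host-side check we could run before involving Docker at all follows. The function name check_host_gpu and its messages are our own and are not part of the NVIDIA instructions; the only assumption is that nvidia-smi exits non-zero when it cannot talk to the driver.

```shell
# Hypothetical helper (not from the nvidia-docker instructions):
# report whether the host itself can see the NVIDIA driver.
check_host_gpu() {
    # nvidia-smi ships with the driver, not with the CUDA toolkit.
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: driver utilities are not installed on the host"
        return 1
    fi
    # Assumption: a non-zero exit means the driver is not loaded or no GPU is visible.
    if ! nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi failed: driver not loaded or no GPU visible"
        return 1
    fi
    echo "host driver OK"
}

check_host_gpu || echo "fix the host driver before retrying docker run"
```

If this check fails on the host, the docker run command would presumably fail in the same way regardless of which nvidia/cuda image tag is used.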