Yet another "Driver Not Loaded / can't communicate with the NVIDIA driver" error while trying to deploy a docker container with GPU support on WSL2

Hello.

I’m trying to deploy a docker container with GPU support on Windows Subsystem for Linux.

These are the commands that I have issued (taken from here: https://dilililabs.com/zh/blog/2021/01/26/deploying-docker-with-gpu-support-on-windows-subsystem-for-linux/):

sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt-get update

sudo apt-get install cuda-toolkit-11-0
curl https://get.docker.com | sh
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
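In case anyone wonders what that `distribution` line produces: it sources `/etc/os-release` in a subshell and concatenates the `ID` and `VERSION_ID` fields. A minimal sketch (it copies the two relevant fields into a temp file so it runs anywhere; the values assume Ubuntu 20.04):

```shell
# Recreate the two relevant /etc/os-release fields (assumption: on Ubuntu
# 20.04 the real file contains ID=ubuntu and VERSION_ID="20.04")
cat > /tmp/os-release <<'EOF'
ID=ubuntu
VERSION_ID="20.04"
EOF

# Same trick as the setup command: source the file in a subshell,
# then join the two fields
distribution=$(. /tmp/os-release; echo $ID$VERSION_ID)
echo "$distribution"   # ubuntu20.04
```

That `ubuntu20.04` string is what gets substituted into the nvidia.github.io repository URLs in the next commands.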

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list

sudo apt-get update

sudo apt-get install nvidia-docker2 cuda-toolkit-11-0 cuda-drivers

sudo service docker start
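As a sanity check after those installs: the nvidia-docker2 package registers an extra runtime with the Docker daemon through /etc/docker/daemon.json. A sketch of what the file typically contains after installation (exact contents may differ per package version):

```json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```

If that runtime entry is missing, `--gpus all` has nothing to hand the GPU request to.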

I’m not able to run this docker container:

docker run --rm --gpus all nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04

Unable to find image 'nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04' locally
11.0-cudnn8-devel-ubuntu18.04: Pulling from nvidia/cuda
171857c49d0f: Pull complete
419640447d26: Pull complete
61e52f862619: Pull complete
2a93278deddf: Pull complete
c9f080049843: Pull complete
8189556b2329: Pull complete
c306a0c97a55: Pull complete
4a9478bd0b24: Pull complete
19a76c31766d: Pull complete
Digest: sha256:11777cee30f0bbd7cb4a3da562fdd0926adb2af02069dad7cf2e339ec1dad036
Status: Downloaded newer image for nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

In addition:

root@DESKTOP-N9UN2H3:/mnt/c/Program Files/cmder# nvidia-smi

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Failed to properly shut down NVML: Driver Not Loaded

(I’m using Windows 10 build 21376co_release.210503-1432.)

On the host I have installed the nvidia driver vers. 470.14 and inside WSL2 I have ubuntu 20.04.

This is a known bug in the latest nvidia docker libs. It will be fixed in an upcoming Windows driver, but for now see the workaround in nvidia-docker 2.6.0-1 - not working on Ubuntu WSL2 · Issue #1496 · NVIDIA/nvidia-docker (github.com)

The workaround is to set NVIDIA_DISABLE_REQUIRE=1? What happens if I disable NVIDIA? Which function will I lose? It does not seem like a workaround if it disables functionality. I’ve used the standalone version of Docker for Windows and it worked: with that I can correctly run containers with GPU support. That seems to be the real workaround.

No, the workaround is installing the previous libraries version with
sudo apt-get install nvidia-docker2:amd64=2.5.0-1 nvidia-container-runtime:amd64=3.4.0-1 nvidia-container-toolkit:amd64=1.4.2-1 libnvidia-container-tools:amd64=1.3.3-1 libnvidia-container1:amd64=1.3.3-1
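One caveat with the downgrade: a later `apt-get upgrade` will pull the broken versions back in unless you hold the packages, e.g. with `sudo apt-mark hold nvidia-docker2 nvidia-container-runtime nvidia-container-toolkit libnvidia-container-tools libnvidia-container1`, or with a pin file. A sketch of such a file (hypothetical path /etc/apt/preferences.d/nvidia-docker, assuming exactly the versions above):

```
Package: nvidia-docker2
Pin: version 2.5.0-1
Pin-Priority: 1001

Package: nvidia-container-runtime
Pin: version 3.4.0-1
Pin-Priority: 1001

Package: nvidia-container-toolkit
Pin: version 1.4.2-1
Pin-Priority: 1001

Package: libnvidia-container-tools libnvidia-container1
Pin: version 1.3.3-1
Pin-Priority: 1001
```

Priority 1001 forces apt to keep (or even downgrade to) the pinned version over anything newer in the repo.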

NVIDIA_DISABLE_REQUIRE=1 doesn’t disable anything important, it just ignores the CUDA version check. It’s needed because in WSL2 the CUDA version is always incorrectly reported as version 11 by docker.
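For reference, it is just an environment variable passed into the container with `-e`. A sketch (the actual docker invocation assumes a working WSL2 GPU setup and the image from this thread, so it is guarded on the WSL2 GPU device /dev/dxg):

```shell
# The variable simply travels into the child process environment:
NVIDIA_DISABLE_REQUIRE=1 sh -c 'echo "inside child: $NVIDIA_DISABLE_REQUIRE"'

# Actual run (guarded: /dev/dxg only exists in a WSL2 distro with GPU
# passthrough; image name taken from this thread)
if [ -e /dev/dxg ] && command -v docker >/dev/null 2>&1; then
  docker run --rm --gpus all -e NVIDIA_DISABLE_REQUIRE=1 \
    nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04 nvidia-smi
fi
```

With the variable set, nvidia-container-cli skips only the `cuda>=X` requirement check baked into the image; the driver and GPU are still used normally.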


What do you think about Docker for Windows? Does it support the GPU or not? I read that it does.

I’m using Docker Desktop 3.3.1 and GPU works because it uses older nvidia libraries. You may need NVIDIA_DISABLE_REQUIRE=1 depending on the docker image you are running.

I’ve just downgraded the packages as you suggested and this is what happened:

root@DESKTOP-N9UN2H3:/mnt/c/Program Files/cmder# nvidia-smi

NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Failed to properly shut down NVML: Driver Not Loaded

root@DESKTOP-N9UN2H3:/mnt/c/Program Files/cmder# docker run --rm --gpus all nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04

docker: Error response from daemon: dial unix /mnt/wsl/docker-desktop/shared-sockets/guest-services/docker.sock: connect: no such file or directory.
See ‘docker run --help’.

UPDATE :

root@DESKTOP-N9UN2H3:/mnt/c/Program Files/cmder# sudo service docker start

  • Starting Docker: docker [ OK ]

root@DESKTOP-N9UN2H3:/mnt/c/Program Files/cmder# docker run --rm --gpus all nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04

Nothing happened. How can I log in to the docker image?

nvidia-smi is broken and the next driver update should fix it.

It looks like you have both the Nvidia docker and Docker Desktop. You can’t use both at the same time. Go to Docker Desktop options RESOURCES → WSL INTEGRATION and disable docker for the WSL2 distro you are running and try again.

Like this?

This is what happens:

root@DESKTOP-N9UN2H3:/mnt/c/Program Files/cmder# sudo service docker start

  • Starting Docker: docker [ OK ]

root@DESKTOP-N9UN2H3:/mnt/c/Program Files/cmder# docker run --rm --gpus all nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04

Nothing.

You can login to the docker image with
docker run --rm -it --gpus all nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04 bash

Yes, after terminating and relaunching the running WSL2 distro it should work. What’s the output of ls -la $(which docker) ?
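Context for that question (a sketch; path is typical, not guaranteed): with Docker Desktop's WSL integration enabled, the `docker` on the distro's PATH is a symlink pointing under /mnt/wsl/docker-desktop/, while the native apt package installs a regular /usr/bin/docker, so the listing shows which daemon you are actually talking to:

```shell
# Resolve the docker binary on PATH; fall back to a message if it is missing
docker_path=$(command -v docker || true)
if [ -n "$docker_path" ]; then
  # a "-> /mnt/wsl/docker-desktop/..." arrow would mean Docker Desktop's CLI
  ls -la "$docker_path"
else
  echo "docker is not on PATH"
fi
```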

With the bash argument at the end of the command it worked… you are great.
