Hiccups setting up WSL2 + CUDA

lefnire · June 19, 2020, 4:38am

I followed the instructions at CUDA on WSL :: CUDA Toolkit Documentation. First issue: Docker Desktop for Windows didn’t work (I got “no [[gpu]]”-ish errors; I can’t remember the exact text). I had to disable DD’s WSL2 integration, close it (and set it not to start with the system), re-install Ubuntu-18.04, and install Docker manually in WSL2 via get.docker.com. I don’t know if there’s a downside to Docker in WSL2 vs. DD hooked to WSL2, but:

Question 1: will this be working with Docker Desktop for Windows in the end?

Next, NVIDIA/nvidia-docker/README differs from wsl-user-guide. In particular apt-get install nvidia-container-toolkit vs apt-get install -y nvidia-docker2. I understand the former replaces the latter (deprecated)? Any insights there? I’m sticking to wsl-user-guide since I’ve got it working, but actually README was linked to via GPU acceleration in WSL | Microsoft Docs so that sent me down a rabbit hole. So

Question 2: should we be using nvidia-docker2 or nvidia-container-toolkit? Perchance update the README to point Windows users to the user guide? (Something like “some of these packages are different for Windows users, who still use the deprecated one; click here”)

Lastly… actually, I think I’ve just realized that NVIDIA/nvidia-docker/README simply isn’t caught up for Windows users yet, and maybe y’all are holding out until this is out of preview or such. What I was going to say was that docker run --gpus all nvidia/cuda:10.0-base nvidia-smi doesn’t work, though docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark from wsl-user-guide does. So I suppose this is just an extension of part 2.

P_Ramarao · June 19, 2020, 6:01am

hi lefnire,

1. The Docker Desktop WSL 2 backend is not supported yet with GPUs. For now, you will have to install Docker in WSL 2 as you traditionally would on Linux and then install the NVIDIA Container Toolkit (or nvidia-docker2).
2. nvidia-container-toolkit and nvidia-docker2 are, in the end, just wrappers. There is a slight variation depending on which version of Docker you use (19.03 vs. 18.09), but if you choose to install nvidia-docker2, it works across both releases of Docker. I’ll look into making that clearer in the documentation.
3. nvidia-smi does not work because we don’t support NVML in WSL 2 yet. This is listed under Known Limitations in the user guide. We will be adding support for it in the near future.
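
To illustrate the 19.03 vs. 18.09 variation mentioned in the reply above, here is a minimal shell sketch. It rests on an assumption that is not spelled out in this thread: that the difference is the flag passed to docker run, with Docker 19.03 and later accepting --gpus and older releases relying on the "nvidia" runtime that nvidia-docker2 registers. The function name gpu_flag_for_version is hypothetical.

```shell
# Pick the GPU flag for `docker run` from a Docker version string.
# Assumption (not from the thread): 19.03+ accepts --gpus, while older
# releases need the "nvidia" runtime registered by nvidia-docker2.
gpu_flag_for_version() {
    major="${1%%.*}"      # "19.03.11" -> "19"
    rest="${1#*.}"        # "19.03.11" -> "03.11"
    minor="${rest%%.*}"   # "03.11"    -> "03"
    minor="${minor#0}"    # drop one leading zero: "03" -> "3"
    if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "${minor:-0}" -ge 3 ]; }; then
        echo "--gpus all"
    else
        echo "--runtime=nvidia"
    fi
}

# Example use (needs a running Docker daemon, so it is left commented out):
# docker run $(gpu_flag_for_version "$(docker version --format '{{.Server.Version}}')") \
#     nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
```

With the Docker 19.03.11 build reported later in this thread, the function would select --gpus all, which matches the commands used in the user guide.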

GRL · June 19, 2020, 6:46am

Hi :)
Regarding #3 - is there another way of tracking device utilization until nvidia-smi support is added? (PyTorch in my case)
Thanks

kmorozov · June 19, 2020, 6:56pm

Task Manager on the Windows host will show the GPU utilization, if that would work for you.

patricksnape · June 19, 2020, 7:13pm

The docs should mention that the WSL 2 backend for Docker Desktop is not supported to make this clear.

tommywu052 · June 19, 2020, 9:47pm

Hi, after installing Docker in WSL via get.docker.com, I can run the container, but I get the following CUDA error at bodysystemcuda_xxxx. How do you resolve this?

sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
CUDA error at bodysystemcuda_impl.h:159 code=46(cudaErrorDevicesUnavailable) "cudaEventCreate(&m_deviceData[0].event)"
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values

kmorozov · June 19, 2020, 9:53pm

Please do not use the Docker install from get.docker.com.
You would need to remove all components you added this way to your WSL 2 container.

After that you can follow the User Guide (CUDA on WSL :: CUDA Toolkit Documentation) to install the runtime correctly.

tommywu052 · June 19, 2020, 10:14pm

I think the user guide says:
curl https://get.docker.com | sh
Was I wrong to install Docker that way?

kmorozov · June 19, 2020, 10:17pm

I am sorry, I misread your message and thought you installed it via Docker’s script.

Could you check whether the GPU device is supported by your WSL 2 container? Check whether /dev/dxg is there.
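
The check suggested above can be scripted. This is a minimal sketch, assuming (as this post states) that /dev/dxg is the node WSL 2 exposes for the GPU. The function name check_gpu_node and its optional path argument are hypothetical, added only so the same test can be tried against other paths.

```shell
# Report whether a device node exists; defaults to /dev/dxg.
# Uses -e rather than -d because the node need not be a directory.
check_gpu_node() {
    node="${1:-/dev/dxg}"
    if [ -e "$node" ]; then
        echo "found: $node"
        return 0
    fi
    echo "missing: $node"
    return 1
}
```

Calling check_gpu_node with no argument inside the WSL 2 distribution tests /dev/dxg. A "missing" result suggests the GPU is not being exposed to WSL 2 at all, which is a different problem from a CUDA error raised inside a container.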

tommywu052 · June 19, 2020, 10:21pm

tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ ls /dev/dxg
/dev/dxg
tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ sudo apt list | grep libnvidia-container

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libnvidia-container-dev/bionic 1.2.0~rc.2-1 amd64
libnvidia-container-tools/bionic,now 1.2.0~rc.2-1 amd64 [installed,automatic]
libnvidia-container1/bionic,now 1.2.0~rc.2-1 amd64 [installed,automatic]
libnvidia-container1-dbg/bionic 1.2.0~rc.2-1 amd64
tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ sudo apt list | grep nvidia-docker

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

nvidia-docker2/bionic,now 2.3.0-1 all [installed]
tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
CUDA error at bodysystemcuda_impl.h:159 code=46(cudaErrorDevicesUnavailable) "cudaEventCreate(&m_deviceData[0].event)"
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= 1) to run in simulation)
-device=<d> (where d=0,1,2… for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
GPU Device 0: "GeForce GTX 1060" with compute capability 6.1

Compute 6.1 CUDA device: [GeForce GTX 1060]
tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ sudo docker -v
Docker version 19.03.11, build 42e35e61f3

rboissel · June 19, 2020, 10:58pm

Could you run dxdiag on your machine (on the Windows host, not in WSL) and share the results here? There should be a button in the dxdiag interface to save the results to a file that you can post here.

Thanks in advance !

tommywu052 · June 20, 2020, 3:27am

DxDiag.txt (128.2 KB) FYI.

rboissel · June 20, 2020, 3:32am

Thanks !

I see that your driver is 455.38. If you are experiencing a crash or a hang of the program, it could be related to some issues that were fixed in the package updated this morning. Could you download the latest driver here and see if it helps? (Just to be sure, run wsl --shutdown in PowerShell before updating.)

tommywu052 · June 20, 2020, 3:59am

Oh yes, this driver version fixed my issue. Thanks!

tanweer.ali1 · November 25, 2020, 9:37pm

I’m having a similar problem getting my wsl-2 to talk with my GPU properly.

I also first followed the instructions at CUDA on WSL :: CUDA Toolkit Documentation to install cuda-toolkit inside WSL2.

I don’t see a /dev/dxg folder created

The result of running the ./BlackScholes example looks like this:
[./BlackScholes] - Starting...
CUDA error at ../../common/inc/helper_cuda.h:777 code=35(cudaErrorInsufficientDriver) "cudaGetDeviceCount(&device_count)"

I also tried the following steps to install the cuda-toolkit from another thread:

sudo apt update
sudo apt-get install build-essential
wget http://developer.download.nvidia.com/compute/cuda/11.0.1/local_installers/cuda_11.0.1_450.36.06_linux.run
sudo sh cuda_11.0.1_450.36.06_linux.run

Same results.

However, if I go to Windows PowerShell and run the nvidia-smi.exe command,
I see output like:

luizjosebp · January 20, 2021, 7:29pm

I’ve run into exactly the same problem as tanweer.ali1 !

I will try rboissel’s solution now and see if it fixes my problem.

sebastian.kraszewski · May 18, 2021, 10:27pm

Hi, one year later NVML is still not present in WSL2… Apparently the near future is lost in the pandemic mist.

Seb

gurveshsanghera · May 19, 2021, 8:12am

Hi there - please see my post : Guide to run CUDA + WSL + Docker with latest versions (21382 Windows build + 470.14 Nvidia)

Most of the guides are now obsolete since Docker is supporting GPU in Windows via their own WSL2 integration. The guide goes into the steps to get it working correctly.

All the best!

miteshyh1 · May 22, 2021, 6:35pm

Hi There,

Is NVML now supported under WSL2?
