Hiccups setting up WSL2 + CUDA

I followed the instructions at https://docs.nvidia.com/cuda/wsl-user-guide/index.html. First issue: using Docker Desktop for Windows didn’t work (I got “no [[gpu]]”-ish errors, can’t remember exactly). I had to disable Docker Desktop’s WSL2 integration, close it (and set it not to start with the system), re-install Ubuntu-18.04, and install Docker manually in WSL2 via get.docker.com. I don’t know if there’s a downside to Docker in WSL2 vs. Docker Desktop hooked to WSL2, but

Question 1: will this be working with Docker Desktop for Windows in the end?
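A quick way to tell which engine a WSL 2 shell is talking to is the OperatingSystem field reported by docker info. The sketch below assumes that Docker Desktop reports itself as "Docker Desktop" in that field, while an engine installed inside the distro reports the distro name; classify_docker_engine is a hypothetical helper name, not part of Docker.

```shell
#!/bin/sh
# classify_docker_engine: hypothetical helper that labels the string
# printed by `docker info --format '{{.OperatingSystem}}'`.
# Assumption: Docker Desktop reports "Docker Desktop" in that field.
classify_docker_engine() {
    case "$1" in
        *"Docker Desktop"*) echo "docker-desktop" ;;
        "")                 echo "unknown" ;;
        *)                  echo "native" ;;
    esac
}

# Usage inside WSL 2 (needs a reachable Docker daemon):
#   classify_docker_engine "$(docker info --format '{{.OperatingSystem}}' 2>/dev/null)"
```

If this prints docker-desktop, the docker CLI in the distro is still wired to Docker Desktop's WSL 2 integration rather than to an engine installed in the distro itself.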

Next, NVIDIA/nvidia-docker/README differs from wsl-user-guide, in particular apt-get install nvidia-container-toolkit vs. apt-get install -y nvidia-docker2. I understand the former replaces the latter (deprecated)? Any insights there? I’m sticking to wsl-user-guide since I’ve got it working, but the README was actually linked from https://docs.microsoft.com/en-us/windows/win32/direct3d12/gpu-accelerated-training, so that sent me down a rabbit hole. So

Question 2: should we be using nvidia-docker2 or nvidia-container-toolkit? Perchance update README to point Windows users to user-guide? (Like “some of these packages will be different / still using the deprecated for Windows users, click here”)

Lastly… actually I think I’ve just realized that NVIDIA/nvidia-docker/README simply isn’t caught up for Windows users yet, and maybe y’all are holding out until it’s out of preview or such. What I was going to say was that docker run --gpus all nvidia/cuda:10.0-base nvidia-smi doesn’t work, though docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark from wsl-user-guide does. So I suppose this is just an extension of Question 2.

hi lefnire,

  1. Docker Desktop WSL 2 backend is not supported yet with GPUs. You will have to install Docker as you would traditionally in Linux for WSL 2 and then install NVIDIA Container Toolkit (or nvidia-docker2) for now.
  2. nvidia-container-toolkit and nvidia-docker2 in the end are just wrappers. There is a slight variation depending on which version of Docker you use (19.03 vs. 18.09), but if you choose to install nvidia-docker2, then that works across both releases of Docker. I’ll look into making that more clear in the documentation.
  3. nvidia-smi does not work because we don’t support NVML in WSL 2 yet - this is part of the Known Limitations in the user-guide. We will be adding support for it in the near future.
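The 19.03 vs. 18.09 difference in point 2 comes down to how the GPU is requested at docker run time: Docker 19.03 introduced the --gpus flag, while older releases go through the nvidia runtime that nvidia-docker2 registers. A minimal sketch of that choice follows; gpu_run_args is a hypothetical helper name, and the version parsing assumes a plain MAJOR.MINOR.PATCH string such as the one printed by docker version --format '{{.Server.Version}}'.

```shell
#!/bin/sh
# gpu_run_args: hypothetical helper that prints the GPU-related argument
# for `docker run`, given a Docker version string such as "19.03.11".
gpu_run_args() {
    major=${1%%.*}          # "19.03.11" -> "19"
    rest=${1#*.}            # "19.03.11" -> "03.11"
    minor=${rest%%.*}       # "03.11"    -> "03"
    minor=${minor#0}        # drop one leading zero: "03" -> "3"
    [ -n "$minor" ] || minor=0

    if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "$minor" -ge 3 ]; }; then
        echo "--gpus all"          # native flag, Docker 19.03 and later
    else
        echo "--runtime=nvidia"    # runtime registered by nvidia-docker2
    fi
}

# Usage (needs a reachable Docker daemon):
#   gpu_run_args "$(docker version --format '{{.Server.Version}}')"
```

With the Docker 19.03.11 install shown later in this thread, the helper prints --gpus all, which matches the command used in the user guide.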

Hi :)
Regarding #3 - is there another way of tracking device utilization until nvidia-smi support is added? (PyTorch in my case)
Thanks

Task Manager on Windows host will show the GPU utilization if that would work for you.

The docs should mention that the WSL 2 backend for Docker Desktop is not supported to make this clear.

Hi, after installing Docker in WSL via get.docker.com, I can run the container, but I get the following CUDA error at bodysystemcuda_xxxx. How do you resolve this?

sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
CUDA error at bodysystemcuda_impl.h:159 code=46(cudaErrorDevicesUnavailable) "cudaEventCreate(&m_deviceData[0].event)"
Run "nbody -benchmark [-numbodies=]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values

Please do not use the Docker install from get.docker.com.
You would need to remove all components you added this way to your WSL 2 container.

After that you can follow the User Guide (https://docs.nvidia.com/cuda/wsl-user-guide/index.html) to install the runtime correctly.

I think the user guide says:
curl https://get.docker.com | sh
Was I wrong to install Docker that way?

I am sorry, I misread your message and thought you installed it via Docker’s script.

Could you check if the GPU device is supported by your WSL 2 container? Check whether /dev/dxg is there.
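The /dev/dxg check can be scripted so that it gives a clear answer either way. This is only a sketch: check_dxg is a hypothetical helper name, and the assumption that /dev/dxg shows up as a character device (rather than a directory) is mine, not something stated in this thread.

```shell
#!/bin/sh
# check_dxg: hypothetical helper that reports whether the WSL 2 GPU
# device node exists. The path defaults to /dev/dxg and can be overridden.
check_dxg() {
    dev="${1:-/dev/dxg}"
    if [ -c "$dev" ]; then
        echo "$dev: present (character device)"
    elif [ -e "$dev" ]; then
        echo "$dev: present, but not a character device"
    else
        echo "$dev: missing"
        return 1
    fi
}

# Usage inside the WSL 2 distro:
#   check_dxg
```

A non-zero exit status means the node is missing, which points at the Windows build, the WSL 2 kernel, or the driver rather than at Docker.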

tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ ls /dev/dxg
/dev/dxg
tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ sudo apt list | grep libnvidia-container

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libnvidia-container-dev/bionic 1.2.0~rc.2-1 amd64
libnvidia-container-tools/bionic,now 1.2.0~rc.2-1 amd64 [installed,automatic]
libnvidia-container1/bionic,now 1.2.0~rc.2-1 amd64 [installed,automatic]
libnvidia-container1-dbg/bionic 1.2.0~rc.2-1 amd64
tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ sudo apt list | grep nvidia-docker

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

nvidia-docker2/bionic,now 2.3.0-1 all [installed]
tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
CUDA error at bodysystemcuda_impl.h:159 code=46(cudaErrorDevicesUnavailable) "cudaEventCreate(&m_deviceData[0].event)"
Run "nbody -benchmark [-numbodies=]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies= (number of bodies (>= 1) to run in simulation)
-device= (where d=0,1,2… for the CUDA device to use)
-numdevices= (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
GPU Device 0: "GeForce GTX 1060" with compute capability 6.1

Compute 6.1 CUDA device: [GeForce GTX 1060]
tommywu@DESKTOP-5RK65D0:/mnt/c/Users/towu$ sudo docker -v
Docker version 19.03.11, build 42e35e61f3
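The apt list | grep calls above trigger the "apt does not have a stable CLI interface" warning. For scripted checks, dpkg-query is the usual alternative on Debian and Ubuntu. A minimal sketch, where pkg_status is a hypothetical helper name and the package names in the usage comment are the ones that appear in this thread:

```shell
#!/bin/sh
# pkg_status: hypothetical helper that prints whether a Debian/Ubuntu
# package is installed, and its version, without going through `apt`.
pkg_status() {
    if ! command -v dpkg-query >/dev/null 2>&1; then
        echo "$1: dpkg-query not available"
        return 2
    fi
    # ${Status} is "install ok installed" for a fully installed package.
    info=$(dpkg-query -W -f='${Status} ${Version}' "$1" 2>/dev/null)
    case "$info" in
        "install ok installed "*)
            echo "$1: installed ${info#install ok installed }" ;;
        *)
            echo "$1: not installed"
            return 1 ;;
    esac
}

# Usage inside the WSL 2 distro:
#   for p in nvidia-docker2 libnvidia-container1 libnvidia-container-tools; do
#       pkg_status "$p"
#   done
```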

Could you run dxdiag on your machine (on the Windows host, not in WSL) and share the results here? There should be a button in the dxdiag interface to save the results to a file you can post here.

Thanks in advance !

DxDiag.txt (128.2 KB) FYI.

Thanks !

I see that your driver is 455.38. If you are experiencing a crash or a hang of the program, it could be related to some issues that were fixed in this morning’s updated package. Could you download the latest driver here and see if it helps? (Just to be sure, run wsl --shutdown in PowerShell before updating.)
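To recognize this failure automatically, the captured output of the nbody sample can be scanned for the error name reported earlier in the thread. The sketch below only classifies text; classify_nbody_output and its three labels are hypothetical, and the suggested remedy in the comments is simply what is described in this thread.

```shell
#!/bin/sh
# classify_nbody_output: hypothetical helper that labels the captured
# output of the nbody sample run.
classify_nbody_output() {
    case "$1" in
        *cudaErrorDevicesUnavailable*)
            # The symptom reported in this thread (code=46); the suggested
            # fix here is `wsl --shutdown` followed by a Windows driver update.
            echo "devices-unavailable" ;;
        *"CUDA error"*)
            echo "other-cuda-error" ;;
        *)
            echo "ok" ;;
    esac
}

# Usage inside the WSL 2 distro:
#   out=$(sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody \
#         nbody -gpu -benchmark 2>&1)
#   classify_nbody_output "$out"
```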

Oh, yes, this driver version fixed my issue. Thanks!