Failure to install CUDA on WSL 2 Ubuntu

Hello, I’ve followed all the steps in the user guide (https://docs.nvidia.com/cuda/wsl-user-guide/index.html)

but when I run docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark, it doesn’t output anything.

  • I’m on Windows build 20150 (per winver)
  • Docker Desktop is disabled
  • Using kernel Linux version 4.19.121-microsoft-WSL2-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Thu May 14 20:25:24 UTC 2020
  • ls /usr/lib/wsl/lib outputs libcuda.so libcuda.so.1 libcuda.so.1.1 libd3d12.so libdirectml.so libdxcore.so

What can I do to debug this?

Did you start Docker service (sudo dockerd) in a separate WSL2 window?

Should I? I ran sudo service docker start in the same window where I run docker run.

Please start the dockerd daemon in a WSL2 window separate from the WSL2 window where you plan to run the Docker containers.
The blog ( Announcing CUDA on Windows Subsystem for Linux 2) and the User Guide (https://docs.nvidia.com/cuda/wsl-user-guide/index.html) show how to do that.
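
For reference, a minimal sketch of the two-window workflow (the image and flags are simply the ones already used above):

 # WSL2 window 1: start the Docker daemon in the foreground and leave it running
 sudo dockerd

 # WSL2 window 2: run the GPU workload against that daemon
 docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark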

Okay, I’ve attempted this, running sudo dockerd in one window and the docker run in another.

For now I tried running docker run hello-world successfully,
but when running the benchmark, I’ve left it for 30 minutes and there is still no output.

If I run docker ps in yet another window, it shows the following

Is it normal to wait that long? Or is there anything else I could check?

EDIT:
There are several warnings when I run sudo dockerd


Do they affect Docker?

It is a bit odd that you don’t see any output from the benchmark workload in WSL2. Even if something went wrong there, it should at least show an error.
When you are running the workload, do you see any GPU activity in the Task Manager on the Windows host?

Also, just to confirm, are you running the latest version of the driver (455.41 from https://developer.nvidia.com/cuda/wsl/download) ?

Is there any chance you could attach screenshots of the entire WSL2 window where the workload runs, as well as a text file of the dockerd output from the other WSL2 window? (For privacy reasons, feel free to erase the user name in the screenshots.)

I had 455.38 installed on my device (I installed it immediately after the launch announcement of CUDA on WSL). After I updated it to 455.41, it runs smooth as butter :)

thanks for the assistance @kmorozov!

Great. Thank you for confirming the fix.

Works for me. I first followed the official NVIDIA steps; then your solution solved the problem.

Hi,

I got things working by creating a new WSL distro. But none of my pre-existing distros have the driver (/usr/lib/wsl/lib/ is empty), no matter how many times I restart, run wsl --shutdown, etc.

Should it work if I manually symlink the .so files within /usr/lib/wsl/lib/? Or is there any other trick to force my pre-existing distros to see the driver?

No, it wouldn’t work if you manually add drivers there.

However, there is an OS issue where the lxss libraries don’t get mapped correctly into a distro if you already have other WSL2 distros running. While we are waiting on an update in the WIP OS with the fix, the workaround is to make sure you are running only one distro at a time if you plan to use the GPU there.
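
For example, from a Windows command prompt or PowerShell (the distro name below is just an example):

 wsl --shutdown        # stop all running WSL2 distros
 wsl -l -v             # confirm nothing is left in the Running state
 wsl -d Ubuntu-20.04   # start only the distro where you want the GPU
 # inside that distro, /usr/lib/wsl/lib should now contain libcuda.so etc.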


Thank you so much. I’d been searching around for this issue for hours before I found out Docker Desktop was to blame.

Awesome, this worked for me as well. My deviceQuery did pass, but PyTorch was not detecting CUDA. Installing the .run file without the drivers did the trick.
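
For anyone else hitting this, the toolkit-only install is roughly the following (the runfile name is a placeholder; use whichever version you downloaded):

 # --toolkit installs only the CUDA toolkit and skips the Linux driver,
 # which must stay on the Windows side when using WSL2
 sudo sh cuda_<version>_linux.run --silent --toolkit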


Followed the initial instructions (with CUDA 10.2 instead of CUDA 11) and failed:

  • The make step fails for me, but this was solved via sudo make (see the sketch at the end of this post).

  • However, execution of deviceQuery fails due to driver issues:

 ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

Any hints? I didn’t install the driver, as the initial instructions explain.
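
For context, the build and run steps were roughly as follows (the path assumes the default CUDA 10.2 samples location, which may differ on your install):

 cd /usr/local/cuda-10.2/samples/1_Utilities/deviceQuery
 sudo make        # plain make failed here; sudo make worked
 ./deviceQuery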


I was in the same boat and decided to go with CUDA 11. In addition, I uninstalled Docker for Windows and executed wsl --shutdown. I got an error and Windows restarted, but then the samples worked.

Huh, tricky. I strongly need 10.2 for compatibility with another piece of software. Will try restarting and see.

Btw, I haven’t installed Docker at all (maybe it’s that…). I don’t have a lot of memory in the PC, so I’d rather not install it.

Any other ideas?

Having the same issue as most of you had: CUDA driver version is insufficient for CUDA runtime version

I reinstalled the Ubuntu distro and installed CUDA via the .run file with the driver deselected. I reinstalled the Windows CUDA WSL driver just in case - I still get the error. The libraries aren’t present in /usr/lib/wsl/lib.

I was able to run docker nvcr.io/nvidia/k8s/cuda-sample:nbody in a previous distro installation, but once I launched a Docker image that requires CUDA 8, I would get the CUDA driver version is insufficient for CUDA runtime version error.

My Windows build is 19041.450.
Kernel is Linux version 4.19.121-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Fri Jun 19 21:06:10 UTC 2020.
Distro is running on WSL2.

As a side note: I upgraded to Insiders build 20150 while giving this a first attempt, but then found out I didn’t actually need that to run GPU Docker images on WSL2 (thanks to someone who shared an updated kernel on the forums, I guess), so I downgraded. I was still able to run GPU Docker images just fine; the only issue I had was the CUDA driver insufficient error.
However, now, with a fresh install, I’m getting Container Runtime Initialization errors.

You can’t downgrade from an Insider build without a clean install. What probably happened is that you “downgraded” the WSL2 kernel version, or rolled back to an earlier build but were still running on the Insider Dev Channel. Now you are on build 19041, which doesn’t support CUDA in WSL2.
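
If you want to double-check what you are actually running, roughly (only a sketch; the exact requirements are in the user guide linked earlier):

 # Windows (cmd or PowerShell): show the OS build number; at this point CUDA on WSL2
 # needs an Insider Dev Channel build such as 20150, not 19041
 cmd /c ver

 # Inside WSL2: show the kernel version the distro is running on
 uname -r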

I did roll back to an earlier build, that’s correct, and that was a non-Insider build.
This is a little strange, as I could run CUDA Docker images just fine in the previous distro installation.
Well, I guess I’ll get back to it once it runs on a stable build.

I solved the

CUDA Device Query (Runtime API) version (CUDART static linking)

problem by installing the GeForce Game Ready Driver on Windows.