Failure to install CUDA on WSL 2 Ubuntu

I will try to run the .run version. The .deb version did not ask me whether I wanted to install the graphics driver.

Any idea how I can get back to the default driver?

Thank you for the response.

PS: I could remove and reinstall WSL, but I think it would be nice to have a solution for this problem.

I ran: sudo apt-get --purge remove nvidia-driver-450

Then I used the .run file (unchecking the NVIDIA driver):

wget http://developer.download.nvidia.com/compute/cuda/11.0.1/local_installers/cuda_11.0.1_450.36.06_linux.run
sudo sh cuda_11.0.1_450.36.06_linux.run

And now the error is:

./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

Thanks !

The WSL CUDA driver should be mapped in “/usr/lib/wsl/lib/”. Could you verify that this is the one that gets picked up? (Running strace on your application and looking for an open call on it would do the trick.)
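As a rough sketch of that strace check: the helper below reads strace output on stdin and prints the libcuda paths that were opened successfully. The function name libcuda_opens is made up for illustration, and it assumes strace's usual `open(...)` / `openat(...) = fd` line format, where failed opens end in `= -1 ...`.

```shell
# Hypothetical helper: list libcuda.so* paths that were opened successfully,
# given strace output on stdin. Failed opens (" = -1 ...") are dropped.
libcuda_opens() {
    grep -E 'open(at)?\(' \
        | grep 'libcuda\.so' \
        | grep -v ' = -1 ' \
        | grep -oE '"[^"]*libcuda\.so[^"]*"' \
        | tr -d '"' \
        | sort -u
}
```

It could be used like this: strace -f -e trace=open,openat ./deviceQuery 2>&1 | libcuda_opens. On a working WSL setup the printed path would be expected to sit under /usr/lib/wsl/lib/.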

/usr/lib/wsl/lib is empty… How can I install it?

First verify that these files are on the host system in “C:\Windows\System32\lxss\lib”.

If that is the case, there is a chance WSL was already running when you first installed the driver or updated the system. Simply restart WSL by typing in PowerShell:
wsl --shutdown

And then launch your WSL distro like you normally would.

Also check that your distro is running on WSL 2 by typing in powershell:
wsl -l -v
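Once the distro is back up, one way to confirm that the driver libraries were mapped is to look for libcuda.so* from inside WSL. This is only a minimal sketch: the function name wsl_libcuda_present is made up for illustration, and the default directory is the /usr/lib/wsl/lib path mentioned above.

```shell
# Hypothetical helper: succeeds if the given directory
# (default: /usr/lib/wsl/lib) contains at least one libcuda.so* file.
wsl_libcuda_present() {
    dir="${1:-/usr/lib/wsl/lib}"
    for f in "$dir"/libcuda.so*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}
```

For example: wsl_libcuda_present && echo "libcuda is mapped" || echo "/usr/lib/wsl/lib has no libcuda; try wsl --shutdown and relaunch".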

I decided to redo the whole process…

I think the official guide should mention the .run trick to not install the shadow driver.

Here are the steps to install the CUDA toolkit on a fresh WSL Ubuntu 18.04 install:

$ sudo apt update

$ sudo apt install build-essential

$ wget http://developer.download.nvidia.com/compute/cuda/11.0.1/local_installers/cuda_11.0.1_450.36.06_linux.run

$ sudo sh cuda_11.0.1_450.36.06_linux.run

Uncheck the NVIDIA driver in the installer menu.

Ignore the warning:
WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least .00 is required for CUDA 11.0 functionality to work.
To install the driver using this installer, run the following command, replacing with the name of this run file:
sudo .run --silent --driver

Logfile is /var/log/cuda-installer.log

$ cd /usr/local/cuda-11.0/samples/1_Utilities/deviceQuery

$ make

$ ./deviceQuery

./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: “TITAN RTX”
CUDA Driver Version / Runtime Version 11.1 / 11.0
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 24576 MBytes (25769803776 bytes)
(72) Multiprocessors, ( 64) CUDA Cores/MP: 4608 CUDA Cores
GPU Max Clock rate: 1770 MHz (1.77 GHz)
Memory Clock rate: 7001 Mhz
Memory Bus Width: 384-bit
L2 Cache Size: 6291456 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1024
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 6 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 101 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: “GeForce GTX 1080 Ti”
CUDA Driver Version / Runtime Version 11.1 / 11.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 11264 MBytes (11811160064 bytes)
(28) Multiprocessors, (128) CUDA Cores/MP: 3584 CUDA Cores
GPU Max Clock rate: 1582 MHz (1.58 GHz)
Memory Clock rate: 5505 Mhz
Memory Bus Width: 352-bit
L2 Cache Size: 2883584 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 5 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 23 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Peer access from TITAN RTX (GPU0) -> GeForce GTX 1080 Ti (GPU1) : No
Peer access from GeForce GTX 1080 Ti (GPU1) -> TITAN RTX (GPU0) : No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.1, CUDA Runtime Version = 11.0, NumDevs = 2
Result = PASS

Thank you! I would never have guessed that the .run installer behaves differently from the .deb CUDA toolkit package.

I’m glad it works for you now.

Yes, we should add a note that, unlike the .run file, which gives you the option not to install the native Linux driver, the .deb file installs the native driver by default, which is not what you want on WSL.

Again thank you so much for reaching out! And don’t hesitate to post in this forum again if you have other issues or want to give us feedback.

Loved your solution! It works on my PC as well :D

That’s cool. It works on my Surface Book as well. But I’m not sure why I still cannot access the GPU in the CUDA sample:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
docker: Error response from daemon: could not select device driver “” with capabilities: [[gpu]].
ERRO[0000] error waiting for container: context canceled

Same here. In fact, many steps of the “Getting Started” guide [0] failed for me.

[0] https://docs.nvidia.com/cuda/wsl-user-guide/index.html#getting-started

Could you elaborate on which steps from the User Guide failed for you (or better, the first step that failed)?
It would help us determine where the problem is.

It would also help if you could attach the docker version you are using.

Sorry, I should have explained better: I couldn’t get the first step in the ‘Install Docker’ instructions to work.

I installed Docker version 19.03.8 and followed the remaining steps.

But once I try to:

$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

It will output:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request\\\\n\\\"\"": unknown.
ERRO[0000] error waiting for container: context canceled

Will try the solution marked above written by davidhsv.

This message indicates that our nvidia-docker library fell back to the native path instead of the WSL one.

There are two main possibilities for this to happen:

  • The NVIDIA Container Toolkit you installed is not the one with WSL support (we published specific RC packages that include WSL support, and these are the ones we point to in the user guide).
  • GPU support is not enabled. Verify that /dev/dxg is present in your WSL distro (the user guide we published explains how to enable everything).
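The second check can be scripted from inside the distro. Here is a minimal sketch, assuming /dev/dxg is the only thing being tested; the function name dxg_present and its optional path argument are hypothetical and exist only so the check can be tried against another path.

```shell
# Hypothetical helper: succeeds if the GPU paravirtualization device node
# exists. The path defaults to /dev/dxg; the argument is for illustration.
dxg_present() {
    dev="${1:-/dev/dxg}"
    [ -e "$dev" ]
}
```

For example: dxg_present && echo "/dev/dxg found" || echo "/dev/dxg missing: GPU support is not enabled in this distro".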

After I removed Docker Desktop and switched to the Linux Docker CE, that issue was gone, but I then hit the same issue as @ftuuky. I followed https://github.com/microsoft/WSL/issues/4189 to resolve it. Now Docker runs, but I still have an issue with the simulation:

sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
CUDA error at bodysystemcuda_impl.h:159 code=46(cudaErrorDevicesUnavailable) “cudaEventCreate(&m_deviceData[0].event)”
Run “nbody -benchmark [-numbodies=]” to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies= (number of bodies (>= 1) to run in simulation)
-device= (where d=0,1,2… for the CUDA device to use)
-numdevices= (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
GPU Device 0: “GeForce GTX 1060” with compute capability 6.1

Please follow the User Guide for the CUDA on WSL feature. It is posted here: https://docs.nvidia.com/cuda/wsl-user-guide/index.html
You will have to remove the components you happened to install while trying Docker Desktop.

Maybe I’m confused: besides installing the CUDA driver for Win10 (https://developer.nvidia.com/cuda/wsl/download) do I also need to install the drivers in WSL?

wsl cat /proc/version displays:

Linux version 4.19.84-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Wed Nov 13 11:44:37 UTC 2019

Name is Ubuntu-20.04, State is Running and Version is 2.

I’ve installed the Nvidia Drivers for CUDA on WSL for GeForce:

455.41_gameready_win10-dch_64bit_international.exe

In C:\Windows\System32\lxss\lib there are 3 files:

  • libcuda.so
  • libcuda.so.1
  • libcuda.so.1.1

Everything went OK. I restarted the laptop, then followed the guide you mentioned, and everything went fine until:

$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

It now displays this error prompt:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request\\\\n\\\"\"": unknown.
ERRO[0129] error waiting for container: context canceled

Following @davidhsv’s post gave Result = FAIL.

What should I do? Uninstall WSL and start a fresh one?

Thanks

No, for WSL you only have to install the Windows Display Driver on the host. No driver install in WSL is needed.

That doesn’t sound right. Can you look at these steps https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-wsl2 to see whether you can get 4.19.121-microsoft-WSL2-standard?

Thanks !

Just finished following all the steps from Microsoft (I have the latest Windows 10 20H2 and re-downloaded WSL2 from here), and wsl cat /proc/version displays:

Linux version 4.19.104-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Wed Feb 19 06:37:35 UTC 2020

I don’t know how to get the .121 version, and it doesn’t say WSL2-standard, but everything says it’s running WSL2.
Any attempt to remove legacy versions says they don’t exist.

In any case, I installed a fresh Ubuntu (Ubuntu 18.04 LTS, just like the guide) and got the same error:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request\\\\n\\\"\"": unknown.
ERRO[0129] error waiting for container: context canceled

I can’t figure out what I’m doing wrong. Thank you for your time on this.

edit: disregard my post, I needed to change Windows Insider to ‘Fast’ instead of ‘Slow’ and obtain Windows 10 build number 20150 in order to obtain the 4.19.121-microsoft-WSL2-standard.

-F

No worries,

We actually need you to be on 4.19.121 or higher to get GPU acceleration (you will see /dev/dxg in your WSL distro once you are on the right kernel).
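To compare a kernel release string against that minimum, something like the sketch below may help. It assumes GNU sort with -V (version sort), which ships with Ubuntu; the function name kernel_at_least is made up for illustration, and the 4.19.121 default is simply the threshold stated above.

```shell
# Hypothetical helper: succeeds if kernel release $1 is at least version $2
# (default 4.19.121). Relies on GNU sort -V for version ordering.
kernel_at_least() {
    have="${1%%-*}"        # drop the suffix, e.g. -microsoft-standard
    want="${2:-4.19.121}"
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    # The kernel is new enough when the required version sorts first (or ties).
    [ "$lowest" = "$want" ]
}
```

For example: kernel_at_least "$(uname -r)" && echo "kernel is new enough" || echo "kernel is too old for GPU acceleration".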

The easiest way to get the new Linux kernel is via Windows Update, as indicated in the guide.

A couple of things you can double-check to see why you don’t have the latest WSL 2 kernel:

  • Verify you are on the insider fast ring
  • Verify that your windows build is 20150 (or higher)
  • Shut down WSL before doing your updates (in PowerShell, type “wsl --shutdown”), just to be sure
  • In the Settings app, in the Windows Update section (you can access it by searching for Windows Update in your Start menu):
    – Go to “Advanced options” and make sure “Receive updates for other Microsoft products when you update Windows” is set to ON.
    – Click on “Check for updates” again
    – Go to View Update History. You should see “Windows Subsystem for Linux Update - 4.19.121”

Unfortunately, there is nothing we can do until you get the updated kernel. This is a key component in enabling GPU acceleration.
