Failure to install CUDA on WSL 2 Ubuntu

I followed the whole guide and I can’t get this to work on my system.

cat /proc/version:

Linux version 4.19.121-microsoft-WSL2-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Thu May 14 20:25:24 UTC 2020

nvidia-smi on windows:

C:\Users\david>nvidia-smi
Thu Jun 18 21:19:19 2020

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.38       Driver Version: 455.38       CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080 TI WDDM  | 00000000:17:00.0 Off |                  N/A |
| 44%   25C    P8     8W / 250W |    137MiB / 11264MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN RTX          WDDM  | 00000000:65:00.0  On |                  N/A |
| 47%   31C    P8    26W / 280W |   1202MiB / 24576MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    1   N/A  N/A      1480    C+G   Insufficient Permissions        N/A      |
|    1   N/A  N/A      2640    C+G   ...2txyewy\TextInputHost.exe    N/A      |
|    1   N/A  N/A      6644    C+G   C:\Windows\explorer.exe         N/A      |
|    1   N/A  N/A      8232    C+G   ...artMenuExperienceHost.exe    N/A      |
|    1   N/A  N/A      8888    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    1   N/A  N/A     10680    C+G   ...cw5n1h2txyewy\LockApp.exe    N/A      |
|    1   N/A  N/A     10720    C+G   ...perience\NVIDIA Share.exe    N/A      |
|    1   N/A  N/A     12148    C+G   ...me\Application\chrome.exe    N/A      |
|    1   N/A  N/A     12836    C+G   ...n64\EpicGamesLauncher.exe    N/A      |
|    1   N/A  N/A     13100    C+G   ...zf8qxf38zg5c\SkypeApp.exe    N/A      |
|    1   N/A  N/A     13128    C+G   ...ekyb3d8bbwe\YourPhone.exe    N/A      |
|    1   N/A  N/A     13560    C+G   ...4\UnrealCEFSubProcess.exe    N/A      |
|    1   N/A  N/A     14280    C+G   ...8wekyb3d8bbwe\Cortana.exe    N/A      |
|    1   N/A  N/A     15772    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    1   N/A  N/A     17112    C+G   ...lPanel\SystemSettings.exe    N/A      |
|    1   N/A  N/A     17180    C+G   ...b3d8bbwe\WinStore.App.exe    N/A      |
|    1   N/A  N/A     18564    C+G   ...rograms\Notion\Notion.exe    N/A      |
|    1   N/A  N/A     20944    C+G   ...8bbwe\WindowsTerminal.exe    N/A      |
|    1   N/A  N/A     21372    C+G   ...3d8bbwe\MicrosoftEdge.exe    N/A      |
+-----------------------------------------------------------------------------+

Running ./deviceQuery:

david@DESKTOP-OQMJ66C:~/desenvolvimento/cuda-samples/bin/x86_64/linux/release$ ./deviceQuery
./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 100
→ no CUDA-capable device is detected
Result = FAIL

winver:

20150.1000

Ubuntu is 18.04.

CUDA was installed from https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=debnetwork

Any help is appreciated. Thank you!

Thank you for reaching out !

From what you describe, you may have accidentally shadowed the CUDA driver that WSL uses (which is automatically mapped into the WSL environment when you install the Windows display driver 455.38) with the native Linux driver from the CUDA toolkit. Some deployment packages of the toolkit come with a native Linux driver pre-packaged.

The easiest fix is to remove the native display driver that was installed with the toolkit (or redo the WSL setup if that sounds easier), and to skip the driver install if you decide to install a CUDA toolkit (the .run file for the toolkit should ask whether you want to install the native Linux driver as well). As a side note, we are working on safeguards for future CUDA toolkits to make this install process easier.
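
A minimal sketch of how one might look for a shadowing native driver from inside the distro. The helper name native_driver_packages is hypothetical, and the package-name prefixes it matches (nvidia-driver-, cuda-drivers) are assumptions about what the toolkit repository pulled in; adjust them to whatever dpkg -l actually shows on your system.

```shell
# Filter `dpkg -l` output (read from stdin) down to installed packages whose
# names look like a native Linux NVIDIA driver. The prefixes are assumptions.
native_driver_packages() {
    awk '$1 == "ii" && ($2 ~ /^nvidia-driver-/ || $2 ~ /^cuda-drivers/) { print $2 }'
}

# Usage inside the WSL distro:
#   dpkg -l | native_driver_packages
# Anything it lists is a candidate for:
#   sudo apt-get --purge remove <package>
```

Only rows in the "ii" (installed) state are reported, so packages that were already removed but still have leftover configuration are ignored.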

Let us know if it helps !

I will try the .run version. The .deb version did not ask me whether I wanted to install the graphics driver.

Any idea how I can get back to the default driver?

Thank you for the response.

PS: I could remove and reinstall WSL, but I think it would be nice to have a solution for this problem.

I ran sudo apt-get --purge remove nvidia-driver-450

Then I used the .run file (unchecking the NVIDIA driver):

wget http://developer.download.nvidia.com/compute/cuda/11.0.1/local_installers/cuda_11.0.1_450.36.06_linux.run
sudo sh cuda_11.0.1_450.36.06_linux.run

And now the error is:

./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
→ CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

Thanks !

The WSL CUDA driver should be mapped in “/usr/lib/wsl/lib/”. Could you verify that this is the one that gets picked up? (Running strace on your application and looking for an open call on it would do the trick.)
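
A rough sketch of that strace check, assuming strace is installed in the distro and that the loader uses openat (typical on recent glibc). The helper name libcuda_opened is hypothetical; it only filters strace output and reports the libcuda paths that were opened successfully.

```shell
# Read strace output on stdin and print each libcuda path that was opened
# successfully (lines ending in ENOENT are failed lookups and are skipped).
libcuda_opened() {
    grep 'libcuda\.so' | grep -v 'ENOENT' | grep -o '"[^"]*"' | tr -d '"' | sort -u
}

# Usage from the directory containing the sample binary:
#   strace -f -e trace=openat ./deviceQuery 2>&1 | libcuda_opened
```

If the path printed is not under /usr/lib/wsl/lib/, a different libcuda is shadowing the one WSL maps in.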

/usr/lib/wsl/lib is empty… How can I install it?

First verify that these files are on the host system in “C:\Windows\System32\lxss\lib”.

If they are, there is a chance WSL was already running when you installed the driver or updated the system. Simply restart WSL by typing in PowerShell:
wsl --shutdown

And then launch your WSL distro like you normally would.

Also check that your distro is running on WSL 2 by typing in powershell:
wsl -l -v
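
After the restart, a quick check from inside the distro could look like the sketch below. The helper name wsl_libcuda_present is hypothetical; the default directory is the /usr/lib/wsl/lib path mentioned earlier in this thread.

```shell
# Succeeds if the given directory (default: /usr/lib/wsl/lib) contains at
# least one libcuda.so* file.
wsl_libcuda_present() {
    dir="${1:-/usr/lib/wsl/lib}"
    ls "$dir"/libcuda.so* >/dev/null 2>&1
}

if wsl_libcuda_present; then
    echo "WSL CUDA driver libraries are mapped"
else
    echo "nothing mapped yet; check the host directory and restart WSL"
fi
```

The optional argument exists only so the check can be pointed at another directory when experimenting.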

I decided to redo the whole process…

I think the official guide should mention the .run trick for avoiding the shadowing driver.

Here are the steps to install the cuda toolkit on a fresh WSL Ubuntu 18.04 install:

$ sudo apt update

$ sudo apt install build-essential

$ wget http://developer.download.nvidia.com/compute/cuda/11.0.1/local_installers/cuda_11.0.1_450.36.06_linux.run

$ sudo sh cuda_11.0.1_450.36.06_linux.run

Uncheck the nvidia driver:

Ignore the warning:
WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least .00 is required for CUDA 11.0 functionality to work.
To install the driver using this installer, run the following command, replacing with the name of this run file:
sudo .run --silent --driver

Logfile is /var/log/cuda-installer.log

$ cd /usr/local/cuda-11.0/samples/1_Utilities/deviceQuery

$ make

$ ./deviceQuery

./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: “TITAN RTX”
CUDA Driver Version / Runtime Version 11.1 / 11.0
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 24576 MBytes (25769803776 bytes)
(72) Multiprocessors, ( 64) CUDA Cores/MP: 4608 CUDA Cores
GPU Max Clock rate: 1770 MHz (1.77 GHz)
Memory Clock rate: 7001 Mhz
Memory Bus Width: 384-bit
L2 Cache Size: 6291456 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1024
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 6 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 101 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: “GeForce GTX 1080 Ti”
CUDA Driver Version / Runtime Version 11.1 / 11.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 11264 MBytes (11811160064 bytes)
(28) Multiprocessors, (128) CUDA Cores/MP: 3584 CUDA Cores
GPU Max Clock rate: 1582 MHz (1.58 GHz)
Memory Clock rate: 5505 Mhz
Memory Bus Width: 352-bit
L2 Cache Size: 2883584 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 5 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 23 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Peer access from TITAN RTX (GPU0) → GeForce GTX 1080 Ti (GPU1) : No
Peer access from GeForce GTX 1080 Ti (GPU1) → TITAN RTX (GPU0) : No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.1, CUDA Runtime Version = 11.0, NumDevs = 2
Result = PASS

Thank you! I would never have guessed that the .run installer behaves differently from the .deb CUDA toolkit package.

I’m glad it works for you now.

Yes, we should add a note that, unlike the .run file, which gives you the option not to install the native Linux driver, the .deb package installs the native driver by default, which is not what you want on WSL.

Again thank you so much for reaching out! And don’t hesitate to post in this forum again if you have other issues or want to give us feedback.
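
For anyone who prefers to skip the interactive menu, the .run installer also accepts command-line switches. The sketch below assumes its --silent and --toolkit options, which select a non-interactive, toolkit-only install without the native driver; the wrapper name install_toolkit_only and the DRY_RUN variable are hypothetical conveniences, not part of the installer.

```shell
# Install only the CUDA toolkit from a .run installer, skipping the native
# Linux driver. Set DRY_RUN=1 to print the command instead of running it.
install_toolkit_only() {
    runfile="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sudo sh $runfile --silent --toolkit"
    else
        sudo sh "$runfile" --silent --toolkit
    fi
}

# Usage, with the installer from this thread already downloaded:
#   install_toolkit_only cuda_11.0.1_450.36.06_linux.run
```

Running it once with DRY_RUN=1 shows exactly what would be executed before anything is installed.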

Loved your solution! It works on my PC as well :D

That’s cool, it works on my Surface Book as well. But I’m not sure why I still cannot access the GPU in the CUDA sample:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
docker: Error response from daemon: could not select device driver “” with capabilities: [[gpu]].
ERRO[0000] error waiting for container: context canceled

Same here; in fact, many steps of the “getting started” guide [0] failed for me.

[0] CUDA on WSL :: CUDA Toolkit Documentation

Could you elaborate on which steps failed (or better, the first step that failed) for you in the User Guide?
It would help us determine where the problem is.

It would also help if you could tell us which Docker version you are using.

Sorry, I should have explained better: I couldn’t get the first step in the ‘Install Docker’ instructions to work.

I installed Docker version 19.03.8 and followed the remaining steps.

But once I try to:

$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

It will output:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request\\\\n\\\"\"": unknown.
ERRO[0000] error waiting for container: context canceled

Will try the solution marked above written by davidhsv.

This message indicates that our nvidia-docker library falls back onto the native path instead of the WSL one.

There are two main possibilities for this to happen:

  • The NVIDIA Container Toolkit you installed is not the one with WSL support (we published specific RC packages that include WSL support, and these are the ones we point to in the user guide).
  • GPU support is not enabled. Verify that /dev/dxg is present in your WSL distro (the user guide we published explains how to enable everything).
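
The second possibility can be checked with a one-line test from inside the distro. This is a minimal sketch; the helper name gpu_paravirt_device_present is hypothetical, and the optional argument only exists so the check can be tried against another path.

```shell
# Succeeds if the GPU paravirtualization device node exists
# (default: /dev/dxg, the path named above).
gpu_paravirt_device_present() {
    [ -e "${1:-/dev/dxg}" ]
}

if gpu_paravirt_device_present; then
    echo "/dev/dxg found"
else
    echo "/dev/dxg missing: GPU support is not enabled in this distro"
fi
```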

After I removed Docker Desktop and used Linux Docker CE, that issue was gone, but then I had the same issue as @ftuuky. I followed this to resolve it: WSL2: docker: Error response from daemon: cgroups: cannot find cgroup mount destination: unknown. · Issue #4189 · microsoft/WSL · GitHub. Now Docker runs, but I still have an issue with the simulation:

sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
CUDA error at bodysystemcuda_impl.h:159 code=46(cudaErrorDevicesUnavailable) “cudaEventCreate(&m_deviceData[0].event)”
Run “nbody -benchmark [-numbodies=]” to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies= (number of bodies (>= 1) to run in simulation)
-device= (where d=0,1,2… for the CUDA device to use)
-numdevices= (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
GPU Device 0: “GeForce GTX 1060” with compute capability 6.1

Please follow the User Guide for the CUDA on WSL feature. It is posted here: CUDA on WSL :: CUDA Toolkit Documentation
You will have to remove the components you installed while trying Docker Desktop.

Maybe I’m confused: besides installing the CUDA driver for Win10 (https://developer.nvidia.com/cuda/wsl/download) do I also need to install the drivers in WSL?

wsl cat /proc/version displays:

Linux version 4.19.84-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Wed Nov 13 11:44:37 UTC 2019

Name is Ubuntu-20.04, State is Running and Version is 2.

I’ve installed the Nvidia Drivers for CUDA on WSL for GeForce:

455.41_gameready_win10-dch_64bit_international.exe

In C:\Windows\System32\lxss\lib there are 3 files:

  • libcuda.so
  • libcuda.so.1
  • libcuda.so.1.1

Everything went OK. I restarted the laptop, then followed the guide you mentioned, and everything went fine until:

$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

It now displays this error prompt:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request\\\\n\\\"\"": unknown.
ERRO[0129] error waiting for container: context canceled

Following @davidhsv’s post produced Result = FAIL.

What should I do? Uninstall WSL and start a fresh one?

Thanks

No, for WSL you only have to install the Windows Display Driver on the host. No driver install in WSL is needed.

That doesn’t sound right. Can you look at the steps in 1. NVIDIA GPU Accelerated Computing on WSL 2 — CUDA on WSL 12.3 documentation to see if you can get kernel 4.19.121-microsoft-WSL2-standard?
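
A small sketch for comparing the running kernel against the version mentioned above. The helper names are hypothetical, treating 4.19.121 as the threshold is an assumption based on this thread, and the comparison relies on GNU sort -V.

```shell
# Extract the kernel version number from a /proc/version style string on stdin.
kernel_version() {
    sed -n 's/^Linux version \([0-9][0-9.]*\).*/\1/p'
}

# Succeeds if version $1 is at least version $2 (uses GNU sort -V).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if [ -r /proc/version ]; then
    current=$(kernel_version < /proc/version)
    if version_at_least "$current" 4.19.121; then
        echo "kernel $current is at least 4.19.121"
    else
        echo "kernel $current is older than 4.19.121; update the WSL 2 kernel"
    fi
fi
```

With the /proc/version output quoted in the post above, kernel_version yields 4.19.84, which fails the comparison.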

Thanks !
