470.14 - WSL with W10 Build 21343 - NVIDIA-SMI error

Ah, I already have multiple hours into this. Good to see that I'm not the only one with this problem.
I'm on Windows build 21364
nvidia-smi (PowerShell): 470.14, CUDA 11.3 (GeForce 2080)
Tried Ubuntu 18.04 and 20.04

Same here:
Windows 21364.100
NVIDIA Driver 470.25, CUDA 11.4 (GeForce GTX 1050 Ti)
Ubuntu 20.04.2 LTS

EDIT: Tried the TensorFlow GPU test and it works, as does the BlackScholes sample.

Same here

RTX 2080, drivers 470.14
Windows Insider 21364
Ubuntu 18.04

emulcahy@DESKTOP-5C1NA5P:/c/Users/Eogha$ nvidia-smi

NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Failed to properly shut down NVML: Driver Not Loaded

Same on Windows build 21370
WSL2, Ubuntu 20.04.2 LTS
GTX 1080

docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #1:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Failed to properly shut down NVML: Driver Not Loaded

Same problem.

RTX 2080
Driver: 470.14
Windows build: 21370.1
WSL2: 5.4.72-microsoft-standard-WSL2
Ubuntu 20.04.2 LTS

Followed the steps in the CUDA on WSL User Guide, including section 7.3 (Known Limitations), point 2.

Finally some light on this issue. It’s a bug that will be fixed in an upcoming driver.

nvidia-docker 2.6.0-1 - not working on Ubuntu WSL2 · Issue #1496 · NVIDIA/nvidia-docker (github.com)

Docker Desktop still works because it ships its own NVIDIA libraries.

My WSL2 setup, which works without issues:

  1. uname -a: 5.10.16.3-microsoft-standard-WSL2
  2. Driver Version: 470.14
  3. CUDA Version: 11.3
  4. OS build: 21370.1

And I execute nvidia-smi.exe, not nvidia-smi.
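
For anyone comparing the two, a quick hedged check from the same WSL shell (the /usr/lib/wsl/lib path is an assumption based on where the Windows driver package normally mounts its libraries inside WSL2):

$ nvidia-smi.exe                 # Windows-side tool, talks to the Windows driver directly
$ /usr/lib/wsl/lib/nvidia-smi    # WSL-side copy shipped with the driver; this is the one failing with the NVML error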

Rajkumarsaswamy:
That is the Windows version of nvidia-smi, not the WSL version.

The same problem:
NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Failed to properly shut down NVML: Driver Not Loaded

How can I fix it?

NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Running WSL on my Windows 10 machine (not Pro). What is the issue?

I reinstalled the WSL2 driver 465.21, and then everything works as it did originally.
Given the version 470.14, I think the solution is to find the nvidia-utils-470 meta-package.
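
If you want to see what Ubuntu actually offers for that meta-package, a minimal sketch (only for inspection; on WSL2 the user-mode driver is normally injected from the Windows side rather than installed via apt):

$ apt-cache search ^nvidia-utils       # list the nvidia-utils meta-packages known to apt
$ apt-cache policy nvidia-utils-470    # check which 470.x build, if any, would be installed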

I have the same problem, BUT IT IS NOT an NVIDIA problem,
IT IS a WINDOWS problem, because when I installed without the driver I could compile
most of the samples, but nothing works. WHY???
Windows steals the driver from NVIDIA. I am working on a Ryzen 7 and an NVIDIA 1660 Ti,
and now Windows does not launch WSL with the NVIDIA GPU but with the internal AMD one.

I am sure of this: first I tried to solve it from the Windows graphics settings
by promoting Ubuntu+WSL to higher performance, but nothing
happened; then I tried the NVIDIA settings and now WSL/Ubuntu crashed???

The question is how to make WSL see the driver while Windows 10 hides it.
Any help would be very much appreciated.

Hi!
Same issue. I tried to reinstall Windows directly from the Insider ISO, and tried to remove WSL completely and then re-install it … no luck.

Where can I find the WSL2 driver 465.21, so that I can give it a try?
Thanks!

As NVIDIA just doesn't care about its customers (it would be so easy to simply let us get back to a previous version), you might try looking for Release Nvidia CUDA 11.3 Driver v465.21 (Win10 DCH) · RainbowMiner/miner-binaries · GitHub

Unfortunately, when I install that version nvidia-smi shows more or less properly (well, a kind of ERR! shows in the table).

But when I try to run the simple test:
$ sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

I get the annoying:
Error: only 0 Devices available, 1 requested. Exiting.

My setup

  • RTX 2060 Super, Driver 465.21
  • Win 21382.1
  • WSL 2, Ubuntu 18.04, kernel 5.4.72

Anyone succeeded?

You should update your kernel with wsl.exe --update
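
A minimal sketch of the update-and-verify sequence, assuming it is run from PowerShell on the Windows side:

PS> wsl.exe --shutdown    # stop running distros so the new kernel is picked up on the next start
PS> wsl.exe --update      # download and install the latest WSL2 kernel
PS> wsl.exe uname -r      # confirm the kernel version reported inside the default distro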

Dear Sir,

I did that and updated the kernel to 5.10.16. But the error
Error: only 0 Devices available, 1 requested. Exiting.
persists.

I've seen that the ERR! appearing in nvidia-smi might be "normal", as it also appears in some tutorials:

I get the very same output (but with the GPU card reference changed to RTX 2060):
nvidia-smi output

Anyone willing to give me a helping hand?

Thanks for the link to the driver. I installed it and got similar behavior.

  • From a WSL Ubuntu shell, running nvidia-smi:
epinux@DESKTOP-TL2DFPU:/mnt/c/Users/massi$ nvidia-smi
Sun May 16 01:59:47 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.00       Driver Version: 465.21       CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:0A:00.0  On |                  N/A |
| 29%   42C    P8    14W / 151W |    618MiB /  8192MiB |    ERR!      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

And running nvidia-smi.exe (which runs on the Windows side):

epinux@DESKTOP-TL2DFPU:/mnt/c/Users/massi$ nvidia-smi.exe
Sun May 16 01:59:50 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.21       Driver Version: 465.21       CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070   WDDM  | 00000000:0A:00.0  On |                  N/A |
| 29%   42C    P8    14W / 151W |    618MiB /  8192MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2024    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      3748    C+G   ...8bbwe\WindowsTerminal.exe    N/A      |
|    0   N/A  N/A      3844    C+G   ...ontend\Docker Desktop.exe    N/A      |
|    0   N/A  N/A      4608    C+G   ...me\Application\chrome.exe    N/A      |
|    0   N/A  N/A      5304    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      5940    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A      8088    C+G   ...perience\NVIDIA Share.exe    N/A      |
|    0   N/A  N/A      8152    C+G   ...artMenuExperienceHost.exe    N/A      |
|    0   N/A  N/A     17164    C+G   ...e\Current\LogiOverlay.exe    N/A      |
+-----------------------------------------------------------------------------+
epinux@DESKTOP-TL2DFPU:/mnt/c/Users/massi$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Error: only 0 Devices available, 1 requested.  Exiting.
epinux@DESKTOP-TL2DFPU:/mnt/c/Users/massi$

You may notice the nvidia-smi version mismatch:

  • nvidia-smi
| NVIDIA-SMI 470.00       Driver Version: 465.21       CUDA Version: 11.3
  • nvidia-smi.exe
| NVIDIA-SMI 465.21       Driver Version: 465.21       CUDA Version: 11.3

I will try to perform a clean re-install of the whole WSL system and report back.
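
For reference, a hedged sketch of what that clean re-install looks like from PowerShell (the distro name Ubuntu-18.04 is an assumption; note that --unregister permanently deletes the distro's filesystem):

PS> wsl.exe --shutdown                   # stop all running distros
PS> wsl.exe --unregister Ubuntu-18.04    # remove the distro and all of its data
PS> wsl.exe --install -d Ubuntu-18.04    # re-install it (or re-install from the Microsoft Store)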

Tried with Ubuntu 18.04 and 16.04 with similar results:
“Error: only 0 Devices available, 1 requested. Exiting.”

I also spotted the different versions, as you did, looked around, and guessed that it is not very important.

During the nvidia-docker install (sudo apt-get install nvidia-docker2) I also got an issue with a symbolic link to libcuda.so.1. I fixed it with mklink on the Windows host, but I guess that it isn't really important: if you do a sudo ldconfig you will see the warning about the symbolic link (which disappears with the mklink), but it does not actually fix anything. A sketch of that workaround follows the directory listing below.

I was wondering if the issue is the libcuda.so driver. Maybe some of you have different versions from mine, so that I could try a dirty file swap.

Directory of C:\Windows\System32\lxss\lib
16/05/2021 12:53 .
16/12/2020 02:32 133,088 libcuda.so
16/05/2021 12:53 libcuda.so.1 [libcuda.so]
12/05/2021 12:02 785,608 libd3d12.so
12/05/2021 12:02 5,399,104 libd3d12core.so
12/05/2021 12:02 827,904 libdxcore.so
18/03/2021 05:41 6,053,064 libnvcuvid.so.1
18/03/2021 05:41 424,440 libnvidia-encode.so.1
16/12/2020 02:32 192,160 libnvidia-ml.so.1
18/03/2021 05:41 354,808 libnvidia-opticalflow.so.1
16/12/2020 02:32 48,606,768 libnvwgf2umx.so
18/03/2021 05:41 670,104 nvidia-smi
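
For what it's worth, a hedged sketch of the symlink check and the mklink workaround mentioned above (run the Windows part from an elevated Command Prompt; adapt the file names to whatever the listing above shows on your machine):

Inside WSL:
$ sudo ldconfig                         # prints the "libcuda.so.1 is not a symbolic link" warning
$ ls -l /usr/lib/wsl/lib/libcuda.so*    # inspect how the driver libraries are mounted into the distro

From an elevated Windows Command Prompt:
C:\> cd C:\Windows\System32\lxss\lib
C:\> del libcuda.so.1
C:\> mklink libcuda.so.1 libcuda.so     # recreate libcuda.so.1 as a symlink pointing at libcuda.so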

@epifanio Just to make clear: the problems I have are related to the use of nvidia-docker. The GPU is working with the CUDA samples from the WSL2 Ubuntu-18.04 distro. I mean, when I build (sudo make) the CUDA samples from /usr/local/cuda/samples and then run ./BlackScholes, it runs on the GPU (but without any nvidia-docker).
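
For completeness, a minimal sketch of that test (the 4_Finance path is an assumption based on the usual layout of the CUDA 11.x samples shipped with the toolkit):

$ cd /usr/local/cuda/samples/4_Finance/BlackScholes
$ sudo make          # build just this sample
$ ./BlackScholes     # prints the detected GPU and the option-pricing throughput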

Moreover, when I try the Jupyter notebook example (sudo docker run -it --gpus all -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter) it does not show any GPU devices from TensorFlow. In each of the example notebooks I add a new cell with the following commands:
import tensorflow as tf
tf.config.list_physical_devices()
=> returns
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:XLA_CPU:0', device_type='XLA_CPU')]
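
A quicker hedged check of the same thing, without going through the notebook UI (same image tag as above):

$ sudo docker run --rm --gpus all tensorflow/tensorflow:latest-gpu-py3-jupyter \
    python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

An empty list [] here reproduces the problem; a healthy setup prints one PhysicalDevice entry per GPU.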

Really? I have Docker Desktop version 3.3.3 (64133), and GPU in Docker is not working, with the following output:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.

@xinglinqiang Try uninstalling Docker Desktop 3.3.3 and installing version 3.3.1 from https://desktop.docker.com/win/stable/amd64/63152/Docker%20Desktop%20Installer.exe?utm_source=docker&utm_medium=webreferral&utm_campaign=docs-driven-download-win-amd64
That's the only version of Docker Desktop that works for me.