Nvidia-smi can't communicate with driver -- docker-desktop conflict?

I’m trying to get CUDA working on a clean WSL2/Ubuntu install on Win11. I run nvidia-smi from the Ubuntu shell to check GPU health; it works immediately after the install but then starts failing with a “couldn’t communicate with the NVIDIA driver” error.

Although I’ve seen many threads reporting this error message, I haven’t seen reproduction steps or solutions. I think I’ve identified how to repro this problem, which suggests a resource conflict with docker-desktop.

  1. I installed recent NVIDIA drivers (528.24) for Windows.
  2. I (re)installed the latest Ubuntu (2204.1.8.0) from the Microsoft Store.
  3. I run nvidia-smi from my Ubuntu shell, and it successfully finds the GPU.
```
$ nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05    Driver Version: 528.24       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
|  0%   38C    P8    10W / 200W |    353MiB /  8192MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A        23      G   /Xwayland                       N/A      |
+-----------------------------------------------------------------------------+
```
  4. I run wsl --shutdown from PowerShell to restart WSL, open a new Ubuntu shell, and now the command fails:
```
$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Failed to properly shut down NVML: Driver Not Loaded
```

  5. WSL also reports Docker Desktop running. Note that I’m not running any containers; the service just starts in the background.
```
PS C:\Users\Jim> wsl -l -v
  NAME                   STATE           VERSION
* docker-desktop-data    Stopped         2
  Ubuntu                 Running         2
  docker-desktop         Running         2
```
  6. I disable the Docker Desktop service, reboot, and confirm WSL is no longer running it:
```
PS C:\Users\Jim> wsl -l -v
  NAME                   STATE           VERSION
* docker-desktop-data    Stopped         2
  Ubuntu                 Running         2
  docker-desktop         Stopped         2
```
  7. Now nvidia-smi works reliably. I bring up new shells, shut down and restart WSL, and it consistently sees the GPU.
```
$ nvidia-smi
Thu Feb  9 13:32:59 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05    Driver Version: 528.24       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
|  0%   38C    P8    12W / 200W |    459MiB /  8192MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
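
The repro boils down to two observations: whether docker-desktop shows as Running in the wsl -l -v listing, and whether nvidia-smi prints the driver error. Here is a minimal Python sketch of that check. The function names are hypothetical, and it only parses text in the format shown above: it assumes English column headers, and that the output has already been captured as text (wsl.exe typically emits UTF-16, so stray NUL characters are stripped defensively).

```python
DRIVER_ERROR = "couldn't communicate with the NVIDIA driver"


def parse_wsl_list(text):
    """Map distro name -> state from `wsl -l -v` output (English headers assumed)."""
    states = {}
    for line in text.splitlines():
        # Drop the '*' default-distro marker and any NULs left over from UTF-16 output.
        fields = line.replace("\x00", "").replace("*", " ").split()
        if len(fields) != 3 or fields[0] == "NAME":
            continue  # skip the header row, blank lines, and anything unexpected
        name, state, _version = fields
        states[name] = state
    return states


def docker_desktop_running(states):
    """True if the docker-desktop distro is listed as Running."""
    return states.get("docker-desktop") == "Running"


def smi_failed(output):
    """True if nvidia-smi output contains the driver-communication error."""
    return DRIVER_ERROR in output
```

Feeding it the listing from the failing state should report docker-desktop as running, and the listing from the working state should not. That makes it easy to log both facts together after each wsl --shutdown.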

Some resource seems to be in contention, with Docker grabbing it away from Ubuntu, but I don’t understand which one. How do I get these to coexist happily?
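
One way to narrow this down from inside the Ubuntu shell is to check whether the pieces WSL normally provides for GPU access are present after a restart. The sketch below assumes the usual WSL2 layout, where the GPU is exposed through the /dev/dxg device and the Windows driver's user-mode files are mounted under /usr/lib/wsl/lib. Those paths are assumptions about the setup rather than something reported in this thread, and the function name is hypothetical.

```python
import os


def gpu_passthrough_report(dxg="/dev/dxg", libdir="/usr/lib/wsl/lib"):
    """Report which of the assumed WSL GPU components exist on this system."""
    return {
        "dxg_device": os.path.exists(dxg),
        "libcuda": os.path.exists(os.path.join(libdir, "libcuda.so.1")),
        "nvidia_smi": os.path.exists(os.path.join(libdir, "nvidia-smi")),
    }


if __name__ == "__main__":
    for name, present in gpu_passthrough_report().items():
        print(f"{name}: {'present' if present else 'MISSING'}")
```

Comparing the report between the working and failing states might show whether the device or the mounted libraries disappear when docker-desktop starts first.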

I’ve run into the same problem, buddy, and it really f**king confused me.

Microsoft posted an update on a similar issue in this GitHub ticket.

It seems the issue should be resolved in WSL version 1.1.3 (Release 1.1.3 · microsoft/WSL · GitHub).
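
To check whether an installation already has the 1.1.3 release, the output of wsl --version can be compared against that number. Here is a rough Python sketch, assuming the English "WSL version: x.y.z.w" line that Store builds of WSL print; builds that do not support --version simply yield no match. The helper names are hypothetical.

```python
import re


def wsl_version(text):
    """Extract the WSL version as a tuple of ints, or None if not found."""
    # wsl.exe output is often UTF-16; strip NULs in case it was decoded as bytes/ASCII.
    match = re.search(r"WSL version:\s*([\d.]+)", text.replace("\x00", ""))
    if not match:
        return None
    return tuple(int(part) for part in match.group(1).split(".") if part)


def has_fix(text, minimum=(1, 1, 3)):
    """True if the reported WSL version is at least `minimum`."""
    version = wsl_version(text)
    return version is not None and version >= minimum
```

Tuples compare element by element, so a four-part version such as 1.1.3.0 still counts as at least (1, 1, 3).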

We will verify on our end. Could you check whether you are still running into this issue, or whether you are now unblocked?

Sorry for the late reply. The funny thing is that I turned off the option named “Run this profile as Administrator” for the “ps” (PowerShell) profile, and then it worked.
I don’t know why it works, but it’s really weird, lol.