I have two GPUs, and both show up correctly in the NVIDIA Control Panel.
However, when I test PyTorch/TensorFlow in WSL2, only one is recognized:
Code:
docker run -it --gpus all tensorflow/tensorflow:latest-gpu
import tensorflow as tf
tf.config.list_physical_devices('GPU')
Output:
Adding visible gpu devices: 0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
FYI:
Driver: 455.41
WSL2: 4.19.121-microsoft-standard
Thank you!
Hello,
Is this only happening in Docker, or do you see the same issue with apps running directly in your distro, without Docker involved?
Thanks,
Could you also run the dxdiag tool, save the results to a file, and attach it here for review?
The issue is observed both with and without docker.
Please see here for the dxdiag results: DxDiag.txt (126.0 KB)
Thank you!
Thanks for the log file!
Nothing looks wrong from the Windows host perspective. Can you verify that /dev/dxg is present in your distribution and that /usr/lib/wsl/lib/ contains libcuda.so.1?
If neither is present, then you should check your kernel version and verify that you are on 4.19.121-microsoft-standard (or above).
If both are present, then you might have accidentally installed a native Linux driver (for instance, if you installed a CUDA toolkit via the full .deb file).
If you see /dev/dxg but no CUDA libraries in the WSL folders, then there might be some deeper setup issues.
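In case it helps, the checks above can be scripted. This is only a rough sketch: the function name check_wsl_gpu_setup is made up for illustration, the paths are the ones mentioned in this post, and the character-device test is an assumption about how /dev/dxg is exposed.

```python
import os
import platform
import stat


def check_wsl_gpu_setup(dev_path="/dev/dxg",
                        lib_dir="/usr/lib/wsl/lib",
                        lib_name="libcuda.so.1"):
    """Report the kernel release and whether the GPU device node and the
    WSL-provided CUDA library are present (hypothetical helper)."""
    dev_present = os.path.exists(dev_path)
    # Assumption: the device node is a character device, not a regular file.
    is_char_dev = dev_present and stat.S_ISCHR(os.stat(dev_path).st_mode)
    lib_present = os.path.exists(os.path.join(lib_dir, lib_name))
    return {
        "kernel": platform.release(),
        "dxg_present": dev_present,
        "dxg_is_char_device": is_char_dev,
        "libcuda_present": lib_present,
    }


if __name__ == "__main__":
    for key, value in check_wsl_gpu_setup().items():
        print(f"{key}: {value}")
```

Run it inside the WSL distribution (not on the Windows side). The kernel line should then be compared by eye against 4.19.121-microsoft-standard.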
Thanks,
Thank you for the prompt reply!
I checked and verified that both /dev/dxg and /usr/lib/wsl/lib exist, and my WSL kernel version is 4.19.121-microsoft-standard. I even reinstalled WSL to make sure no CUDA toolkit is installed other than win_455.41. Unfortunately, the issue still persists.
One thing to note is that the /dev/dxg file is empty. Is that normal?
Yes, this is a device, not a regular file, so you will not find any content in it.
So just to confirm:
- You do see libcuda.so.1 in /usr/lib/wsl/lib.
- If you run apt list --installed, you don’t see any display driver in the list that might have been accidentally installed?
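If scanning the list by eye is tedious, something like the following might help. It is a sketch only: the package-name patterns are my guesses at what a native driver install could look like, not an official list, and suspect_driver_packages is a hypothetical helper name.

```python
import re
import sys

# Assumption: name prefixes that may indicate a native Linux NVIDIA driver.
# This list is illustrative and almost certainly incomplete.
SUSPECT_PATTERNS = (
    r"^nvidia-driver",
    r"^libnvidia-",
    r"^cuda-drivers",
    r"^libcuda1",
    r"^xserver-xorg-video-nvidia",
)


def suspect_driver_packages(apt_list_output, patterns=SUSPECT_PATTERNS):
    """Return package names from `apt list --installed` output that match
    one of the suspect patterns."""
    names = []
    for line in apt_list_output.splitlines():
        # Package lines look like "name/suite,now version arch [installed]".
        if "/" not in line:
            continue
        name = line.split("/", 1)[0].strip()
        if any(re.search(p, name) for p in patterns):
            names.append(name)
    return names


if __name__ == "__main__" and len(sys.argv) > 1:
    # Expects the path of a file holding saved `apt list --installed` output.
    with open(sys.argv[1], encoding="utf-8") as fh:
        hits = suspect_driver_packages(fh.read())
    print("\n".join(hits) if hits else "No suspect driver packages found.")
```

For example, save the output with apt list --installed > installed-apt.txt and pass that file name as the only argument. An empty result does not prove the system is clean, since the patterns are guesses.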
I double-checked and confirmed that libcuda.so.1 is in /usr/lib/wsl/lib. I also noticed similar files, libcuda.so.1.1 and libcuda.so; not sure whether that helps.
I ran apt list --installed and there seems to be no display driver accidentally installed, but I have attached the result anyway in case you spot anything unusual (I’m quite new to Linux): installed-apt.txt (36.9 KB)
Thank you!
– update Jun 26
To enable CUDA in PyTorch, I have to run conda install cudatoolkit==10.2 (or 10.1). However, judging from the NVIDIA Control Panel, the CUDA version from the driver is 11. Will this create a conflict?
– update Jun 27
I solved this issue by disabling SLI on the Windows host, per this WSL2 issue: "When SLI is enabled on the Windows Host, only 1 GPU shows up in WSL". I’m not sure whose side the problem is on, but I hope it can be fixed in the near future.
When SLI is enabled on the host, only one (logical) GPU is exposed to the Linux guest.
This is how OS GPU paravirtualization support currently works on Windows 10.
So if you want multiple GPUs in the VM guest, you need to disable SLI on the host.