You should also double-check that you didn’t accidentally install a native Linux driver on your WSL system. The fact that you have the nvidia-smi binary is a bit suspicious in that respect.
If you install a native driver (either directly, or indirectly through a toolkit package that pulls in the native driver as a dependency), it will shadow the real WSL driver and your apps will pick the wrong one. There are a couple of posts on that topic.
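A rough way to check for this from inside the distro is sketched below. It assumes the WSL driver files are exposed under /usr/lib/wsl/ (that path and the helper name classify_driver_path are my assumptions, not something confirmed in this thread), so treat the output as a hint rather than a verdict.

```shell
#!/bin/sh
# Sketch only: guess whether a driver file comes from the WSL mount or a native install.
# Assumption: WSL exposes the Windows driver's user-mode files under /usr/lib/wsl/.
# classify_driver_path is a hypothetical helper written for this example.
classify_driver_path() {
  case "$1" in
    /usr/lib/wsl/*) echo "wsl" ;;      # looks like the WSL-provided driver
    "")             echo "missing" ;;  # nothing found
    *)              echo "native?" ;;  # anywhere else: possibly a native Linux driver
  esac
}

# Where does nvidia-smi come from, if it exists at all?
smi_path="$(command -v nvidia-smi || true)"
echo "nvidia-smi: ${smi_path:-not found} -> $(classify_driver_path "$smi_path")"

# If a Debian package owns the binary, dpkg -S names that package.
if [ -n "$smi_path" ] && command -v dpkg >/dev/null 2>&1; then
  dpkg -S "$smi_path" 2>/dev/null || echo "not owned by any package (perhaps a .run installer)"
fi

# List every libcuda the dynamic linker knows about; more than one location is a warning sign.
if command -v ldconfig >/dev/null 2>&1; then
  ldconfig -p 2>/dev/null | grep libcuda || echo "no libcuda in the linker cache"
fi
```

If nvidia-smi or a libcuda entry shows up outside the WSL path, that would be consistent with a native driver shadowing the WSL one.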
I tried uninstalling all PyTorch-related packages and installing the CPU-only version, since the GPU version includes cuda-toolkit.
After that I just tried this example
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
I have both toolkits 10.1 and 11 installed, so that’s not the problem. As rboissel said, you probably have the Linux driver installed, although it doesn’t look like it was installed via apt install. Either way you will need to uninstall it manually with sudo nvidia-uninstall or similar.
I just tried on a brand new Ubuntu 18.04 install in WSL2 and the docker nbody sample works just fine.
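Before removing anything, it may help to see what is actually installed. The snippet below is only a sketch: list_nvidia_packages is a hypothetical helper that filters dpkg -l output, and not every match is a problem (container tooling for docker --gpus is expected to be there).

```shell
#!/bin/sh
# Sketch only: show installed packages whose name mentions nvidia.
# list_nvidia_packages is a hypothetical helper; it reads dpkg -l output on stdin
# and prints the names of fully installed packages (status "ii").
list_nvidia_packages() {
  awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

if command -v dpkg >/dev/null 2>&1; then
  # Driver packages in this list are suspects; container tooling is not.
  dpkg -l | list_nvidia_packages
fi

# A driver installed outside of apt usually leaves this uninstaller behind.
command -v nvidia-uninstall || echo "nvidia-uninstall not found"
```

If the list is empty but nvidia-uninstall exists, that would match a driver installed outside of apt.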
I previously installed the NVIDIA driver for Windows, and I can run nvidia-smi from my Windows terminal. Do I need to uninstall it? Other than that, I really can’t recall installing a Linux driver.
You don’t need to uninstall the Windows driver. The problem is inside your Ubuntu install, since nvidia-smi is installed with the NVIDIA Linux driver.
What’s the output of dpkg -S $(which nvidia-smi) ?
About the docker error, do you happen to have Docker Desktop installed in Windows?
That tutorial is not for CUDA but for DirectML, and it should work in its own context (conda activate directml)
>>> tf.config.experimental.list_physical_devices('GPU')
2020-09-11 23:11:15.973596: I tensorflow/core/common_runtime/dml/dml_device_factory.cc:45] DirectML device enumeration: found 1 compatible adapters.
Your Ubuntu install is right now in an unpredictable state. If you want to see CUDA running in WSL2 then you should uninstall Ubuntu and reinstall again.
I downloaded Ubuntu through this link: Manual installation steps for older versions of WSL (Microsoft Learn)
Since I don’t want to install on the system drive, I changed the file extension from ‘.appx’ to ‘.zip’ and unzipped it to my working drive (D:). From there, I could just run the ‘ubuntu1804.exe’ file. After that I installed everything as described in the CUDA on WSL tutorial.
Could there be anything wrong in this process?
PS: My wsl version is now:
C:\Users\chest>wsl cat /proc/version
Linux version 4.19.128-microsoft-standard (oe-user@oe-host) (gcc version 8.2.0 (GCC)) #1 SMP Tue Jun 23 12:58:10 UTC 2020
That’s just fine, I also have my distro installed on another partition. If you don’t mind deleting everything inside the Ubuntu distro, you only need to do this:
Open cmd.exe and run wsl.exe --unregister Ubuntu-18.04
That will delete the ext4.vhdx file that contains the distro. The next time you double-click ubuntu1804.exe, it will install anew.
Well, now at least you can tell me exactly which steps you took to get there. Use history and post all the commands you entered, starting from the brand new Ubuntu 18.04 install, and I’ll try to reproduce it.