[SOLVED]
Using this guide: t81_558_deep_learning/tensorflow-install-jul-2020.ipynb at master · jeffheaton/t81_558_deep_learning · GitHub
I finally got my conda environment to detect and use my GPU. Here are the steps I used (I don’t know whether all of them were necessary):
conda install nb_conda
conda install -c anaconda tensorflow-gpu
conda update cudnn
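One way to confirm that the steps above took effect is to ask TensorFlow directly from inside the conda environment. This is only a minimal sketch: gpu_report is a hypothetical helper name (not something from the guide), and it returns None when TensorFlow cannot be found in the active environment.

```python
import importlib.util


def gpu_report():
    """Describe what TensorFlow sees, or return None if TensorFlow is missing."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tf_version": tf.__version__,
        # True only if this TensorFlow build was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpu_names": [d.name for d in gpus],
    }


if __name__ == "__main__":
    print(gpu_report())
```

An empty gpu_names list means TensorFlow still sees no GPU, and a False built_with_cuda would suggest that a CPU-only build is installed.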
As a side note, it’s a bit of a head-scratcher that the various NVIDIA and TensorFlow guides you can find will tell you things like “don’t install cuDNN through Linux if you’re using WSL, you only need the Windows install” or “the latest versions of TensorFlow don’t require you to conda install tensorflow-gpu, as it’s now included in the basic tensorflow package”, yet doing exactly that ended up being the solution (for my situation, anyway)! Hope this proves helpful to anyone else!
[ORIGINAL ISSUE]
I’m running the following:
OS: Win10 Pro Insider Preview Build 20241 (latest)
WSL: version 2
Distro: Ubuntu 20.04
GPU: GeForce 970 (CUDA-enabled), CUDA driver v460.20 (latest preview)
Environment: Miniconda
Code editor: Visual Studio Code
Program type: Jupyter Notebook with Python 3.8.5 + TensorFlow library (v. 2.2.0)
I’ve followed your guide for using a GPU in WSL2 and have successfully passed the test for running CUDA Apps: CUDA on WSL :: CUDA Toolkit Documentation
However, when I open a Jupyter notebook in VS Code in my conda environment, import TensorFlow and run this:
tf.config.list_physical_devices()
I only get the following output:
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
PhysicalDevice(name='/physical_device:XLA_CPU:0', device_type='XLA_CPU')]
Other similar functions confirm that I have 0 GPUs. The calculations I’ve attempted have also been slow, so they’re most likely running on the CPU. The GPU is simply not detected.
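Rather than inferring the device from speed, it is possible to ask an eager tensor where it was computed through its .device attribute. The following is a rough sketch assuming TensorFlow 2.x with eager execution; device_kind and where_does_matmul_run are hypothetical helper names introduced only for illustration.

```python
import importlib.util


def device_kind(device_string):
    """Extract the device type from a TensorFlow device string."""
    # ".../device:CPU:0" -> "CPU:0" -> "CPU"
    return device_string.split("device:")[-1].split(":")[0]


def where_does_matmul_run():
    """Run a small matmul and return its device type, or None without TensorFlow."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    a = tf.random.uniform((256, 256))
    product = tf.matmul(a, a)
    # Eager tensors record the device that produced them.
    return device_kind(product.device)


if __name__ == "__main__":
    print(where_does_matmul_run())
```

If this reports CPU even though a GPU is installed, the slow computations are indeed staying on the CPU.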
Likewise, when I run lspci | grep -i nvidia
at the command line, nothing happens, though I figured this might be normal since this is technically a VM if I understand correctly (yes, I’m a bit of a newbie).
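An empty lspci result appears to be expected under WSL2: the GPU is not presented to the distro as a PCI device but through a paravirtualized device node, with the driver's user-mode libraries mounted in from Windows. The sketch below checks for both; the paths /dev/dxg and /usr/lib/wsl/lib are assumptions about a typical WSL2 GPU setup rather than guarantees, and wsl_gpu_hints is a hypothetical helper name.

```python
import os

# Assumed locations on a WSL2 distro with GPU support enabled.
WSL_GPU_DEVICE = "/dev/dxg"
WSL_DRIVER_LIB_DIR = "/usr/lib/wsl/lib"


def wsl_gpu_hints():
    """Report whether the assumed WSL2 GPU device node and CUDA libraries exist."""
    libcuda_files = []
    if os.path.isdir(WSL_DRIVER_LIB_DIR):
        libcuda_files = sorted(
            name for name in os.listdir(WSL_DRIVER_LIB_DIR)
            if name.startswith("libcuda")
        )
    return {
        "dxg_device_present": os.path.exists(WSL_GPU_DEVICE),
        "libcuda_files": libcuda_files,
    }


if __name__ == "__main__":
    print(wsl_gpu_hints())
```

If the device node is present and libcuda files are listed, the WSL2 side is probably fine and the problem more likely lies in the TensorFlow or CUDA libraries inside the environment.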
Am I doing something wrong?
[EDIT] FYI, I tried leaving my conda environment and installing the latest version of TensorFlow (2.3.1) with pip. When I do that, tf.config.list_physical_devices()
now gives me this:
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
PhysicalDevice(name='/physical_device:XLA_CPU:0', device_type='XLA_CPU'),
PhysicalDevice(name='/physical_device:XLA_GPU:0', device_type='XLA_GPU')]
and print(device_lib.list_local_devices())
now outputs this:
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 10667357857527575874
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 11863879757054802269
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 12285707176002128760
physical_device_desc: "device: XLA_GPU device"
]
So my GPU seems to be detected on some level with the latest TensorFlow, but TensorFlow is unable to find the device name, and it still tells me I have 0 GPUs available when I run print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
And again, my computations are still run on the CPU, so I don’t think I’m quite there yet.
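The mismatch between the two results is consistent with how the device filter works: list_physical_devices('GPU') only returns devices whose device_type is exactly GPU, and XLA_GPU is a separate type. The sketch below reproduces that counting on stand-in objects built from the device list in the [EDIT] output; count_device_types and the SimpleNamespace stand-ins are illustrative only and are not TensorFlow API.

```python
from collections import Counter
from types import SimpleNamespace


def count_device_types(devices):
    """Count devices by their exact device_type string."""
    return Counter(d.device_type for d in devices)


# Stand-ins mirroring the PhysicalDevice entries reported with TensorFlow 2.3.1.
reported = [
    SimpleNamespace(name="/physical_device:CPU:0", device_type="CPU"),
    SimpleNamespace(name="/physical_device:XLA_CPU:0", device_type="XLA_CPU"),
    SimpleNamespace(name="/physical_device:XLA_GPU:0", device_type="XLA_GPU"),
]

counts = count_device_types(reported)
print("Num GPUs Available:", counts["GPU"])  # prints: Num GPUs Available: 0
```

So an XLA_GPU entry on its own does not make the GPU count go above zero; a plain GPU device has to appear in the list.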