Windows Insider Preview Build 2143.rs_prerelease.210320-1757
NVIDIA driver 470.14
GeForce RTX 3090
Using WSL2
CUDA Toolkit 11.2
dpkg -l | grep nvidia
ii libnvidia-container-tools 1.3.3-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.3.3-1 amd64 NVIDIA container runtime library
ii nvidia-container-runtime 3.4.2-1 amd64 NVIDIA container runtime
ii nvidia-container-toolkit 1.4.2-1 amd64 NVIDIA container runtime hook
ii nvidia-docker2 2.5.0-1 all nvidia-docker CLI wrapper
I followed the instructions at CUDA on WSL :: CUDA Toolkit Documentation with a fresh WSL2 and Ubuntu setup, then followed the toolkit installation instructions found here: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=WSLUbuntu&target_version=20&target_type=deblocal, except for the last step. Instead of installing the full cuda package (my impression was that we don't want to do that on WSL?), I ran sudo apt-get install cuda-toolkit-11-2.
Happy to provide more info. It's disappointing that this isn't working, since I was hoping to do some serious work on my Windows install using the new WSL2 Docker GPU support.
Main problems:
Running the Docker GPU example fails:
sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
[sudo] password for nivintw:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.2, please update your driver to a newer version, or use an earlier cuda container: unknown.
ERRO[0000] error waiting for container: context canceled
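If I understand the error correctly (this is my reading, not something stated in the docs), nvidia-container-cli compares the CUDA version the driver reports against the cuda>=11.2 requirement attached to the image, and refuses to start the container when the driver side comes up short. Conceptually the check is just a dotted-version comparison, something like this sketch (function name is mine, not the real implementation):

```python
def satisfies(driver_cuda: str, required: str) -> bool:
    """Compare dotted version strings numerically, e.g. '11.2' vs '11.1'."""
    to_tuple = lambda v: tuple(int(x) for x in v.split("."))
    return to_tuple(driver_cuda) >= to_tuple(required)

# The hook aborts with 'unsatisfied condition: cuda>=11.2' when this is False.
print(satisfies("11.1", "11.2"))  # False
print(satisfies("11.3", "11.2"))  # True
```

Given that nvidia-smi can't talk to the driver either (see below), my guess is the hook isn't seeing a merely older driver, it's seeing no usable driver version at all.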
nvidia-smi does not work (after following the instructions for copying the binary and updating permissions):
nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Failed to properly shut down NVML: Driver Not Loaded
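For what it's worth, a quick way to check whether the CUDA stub is even visible to the dynamic loader inside WSL (a diagnostic sketch of my own, not from the docs; my understanding is that on WSL2 the real driver runs on the Windows host and the guest only gets a stub libcuda mounted under /usr/lib/wsl/lib):

```python
import ctypes.util

# On WSL2 the kernel-mode driver lives on the Windows host; the Linux guest
# is only supposed to see a stub libcuda.so mapped in by WSL. If the loader
# cannot find it, both nvidia-smi and the container hook will fail.
path = ctypes.util.find_library("cuda")
print("libcuda visible to loader:", path)  # None means the stub is not on the path
```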
Further info:
I was able to launch the TensorFlow container from the documentation and allocate tensors on the GPU, so that container seems to be functional, although the following code took much longer than expected (minutes rather than seconds) to execute:
import tensorflow as tf

a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.random.uniform(shape=[3,2])
c = tf.matmul(a, b)
print(c)
Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Sub in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Mul in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Add in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[2.872214 3.5481367]
[6.7865934 8.095843 ]], shape=(2, 2), dtype=float32)
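To put a number on "much longer than expected" next time, I'd wrap the calls in a small timing helper (plain Python, no TensorFlow needed; helper name is mine). Running it twice around tf.matmul would also separate first-call overhead (CUDA context creation, kernel JIT) from steady-state time:

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn once and report elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{getattr(fn, '__name__', 'call')}: {elapsed:.3f}s")
    return result

# Stand-in workload here; in the container you would wrap tf.matmul(a, b)
# the same way, once for the cold first call and once for a warm call.
total = timed(sum, range(1_000_000))
print(total)  # 499999500000
```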