CUDA Error: Device Unavailable

I’m trying to get CUDA working on Ubuntu with WSL2. I followed the guide here very carefully, starting from a fresh install, but I’m still running into issues. I’m fairly sure that I have the proper version of WSL, the Microsoft Windows Insiders Program build, and the driver (installed on Windows, not in WSL). I used the solution from this post to install the CUDA toolkit without the driver.

The issue I’m running into is that both the BlackScholes sample and docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark fail with the following errors:

Docker:

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "GeForce GTX 770" with compute capability 3.0

> Compute 3.0 CUDA device: [GeForce GTX 770]
CUDA error at bodysystemcuda_impl.h:159 code=46(cudaErrorDevicesUnavailable) "cudaEventCreate(&m_deviceData[0].event)"

BlackScholes:

[./BlackScholes] - Starting...
GPU Device 0: "Kepler" with compute capability 3.0

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
CUDA error at BlackScholes.cu:116 code=46(cudaErrorDevicesUnavailable) "cudaMalloc((void **)&d_CallResult, OPT_SZ)"

Any thoughts on what might be causing this?

For further context, here’s what happens when I try to run the classification Jupyter notebook from the example in the CUDA on WSL walkthrough. This line seems particularly interesting, though I’m not entirely sure what to make of it:
Ignoring visible gpu device (device: 0, name: GeForce GTX 770, pci bus id: 0000:23:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.

Full output:

[I 01:24:10.969 NotebookApp] Kernel started: 9130b152-f94e-4500-887a-3d134a523ee0
2020-07-01 01:24:14.498882: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-07-01 01:24:14.500040: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-07-01 01:24:19.026519: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-01 01:24:19.252813: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:23:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-07-01 01:24:19.253031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:23:00.0 name: GeForce GTX 770 computeCapability: 3.0
coreClock: 1.189GHz coreCount: 8 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 208.91GiB/s
2020-07-01 01:24:19.253085: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-01 01:24:19.253139: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-01 01:24:19.254302: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-01 01:24:19.254502: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-01 01:24:19.255435: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-01 01:24:19.255916: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-01 01:24:19.255979: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-01 01:24:19.256528: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:23:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-07-01 01:24:19.257211: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:967] could not open file to read NUMA node: /sys/bus/pci/devices/0000:23:00.0/numa_node
Your kernel may have been built without NUMA support.
2020-07-01 01:24:19.257466: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1651] Ignoring visible gpu device (device: 0, name: GeForce GTX 770, pci bus id: 0000:23:00.0, compute capability: 3.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
2020-07-01 01:24:19.257969: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-01 01:24:19.263920: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3600015000 Hz
2020-07-01 01:24:19.265878: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x60c0bc0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-01 01:24:19.265912: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-01 01:24:19.321836: W tensorflow/compiler/xla/service/platform_util.cc:276] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_UNKNOWN: unknown error
2020-07-01 01:24:19.321973: I tensorflow/compiler/jit/xla_gpu_device.cc:136] Ignoring visible XLA_GPU_JIT device. Device number is 0, reason: Internal: no supported devices found for platform CUDA
2020-07-01 01:24:19.322172: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-01 01:24:19.322207: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]

The error message is telling you that TensorFlow requires a GPU with compute capability 3.5 or higher.

Your GPU is only compute capability 3.0:

GPU Device 0: "GeForce GTX 770" with compute capability 3.0
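
If it helps to see the comparison spelled out, here is a minimal sketch in plain Python (no TensorFlow or CUDA needed) that pulls the two version numbers out of the "Ignoring visible gpu device" log line quoted earlier in this thread and compares them. The helper names parse_capability and check_log_line are my own for illustration, and the regular expression is only an assumption based on that single log line; other TensorFlow versions may word the message differently.

```python
import re

# Pattern built from the one log line quoted in this thread (an assumption:
# other TensorFlow builds may phrase the message differently).
_IGNORED_GPU = re.compile(
    r"Ignoring visible gpu device \(.*?name: (?P<name>[^,]+),.*?"
    r"compute capability: (?P<have>\d+\.\d+)\).*?"
    r"minimum required Cuda capability is (?P<need>\d+\.\d+)"
)


def parse_capability(text):
    """Turn '3.0' into (3, 0) so versions compare numerically, not as strings."""
    major, minor = text.split(".")
    return int(major), int(minor)


def check_log_line(line):
    """Return (name, have, need, ok) for a matching log line, else None."""
    match = _IGNORED_GPU.search(line)
    if match is None:
        return None
    have = parse_capability(match.group("have"))
    need = parse_capability(match.group("need"))
    # Tuples compare element by element, so (3, 0) >= (3, 5) is False.
    return match.group("name"), have, need, have >= need


line = (
    "Ignoring visible gpu device (device: 0, name: GeForce GTX 770, "
    "pci bus id: 0000:23:00.0, compute capability: 3.0) with Cuda compute "
    "capability 3.0. The minimum required Cuda capability is 3.5."
)
result = check_log_line(line)
print(result)
```

For the line from your log, the last field comes out False: 3.0 is below the 3.5 minimum, which is why this TensorFlow build skips the device. This only reproduces the check the log message describes; it says nothing on its own about the cudaErrorDevicesUnavailable failures in the other two samples.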

Thanks. I guess what I’m really wondering is whether I can downgrade CUDA to remedy the situation, and whether that would affect compatibility with WSL, or whether this can only be solved by upgrading my graphics card. I should have stated that in my follow-up post, sorry.

I suspect that the compute capability issue also underlies the cudaErrorDevicesUnavailable errors from the other two examples, but I only saw the capability message with TensorFlow, so I wasn’t sure whether it was a separate issue.