CUDA works in Windows but not in WSL - how to troubleshoot?

I have CUDA working on Windows 11 (23H2) but have hit a wall trying to get it to work in WSL with Ubuntu. I’ve done the following:

Installed GeForce Experience 3.28.0.417 and NVIDIA Studio Driver 560.81.

Installed WSL and Ubuntu as described here: CUDA on WSL (nvidia.com)

Installed the CUDA Toolkit from here: CUDA Toolkit 12.6 Update 1 Downloads | NVIDIA Developer

When I run nvidia-smi in WSL/Ubuntu it says “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”

When I run nvidia-smi.exe in Windows it says:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.81                 Driver Version: 560.81         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080      WDDM  |   00000000:01:00.0  On |                  N/A |
|  0%   51C    P8             15W /  320W |    1386MiB /  10240MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

So everything seems fine on the Windows side, just not in WSL / Ubuntu.

Under WSL, nvcc --version prints this:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
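
The output above reports release 11.5 even though the toolkit I installed is 12.6, so more than one nvcc may be present and an older one may come first on PATH. Here is a minimal sketch for listing the candidates. The function name find_nvcc_candidates is made up for illustration, and the /usr/local/cuda*/bin pattern is an assumption about where the NVIDIA installer normally puts the toolkit, not something I have confirmed on this machine.

```python
import glob
import os
import shutil


def find_nvcc_candidates(path_env=None, cuda_glob="/usr/local/cuda*/bin/nvcc"):
    """Return (first nvcc resolved on PATH or None, sorted list of all nvcc found).

    cuda_glob is an assumed install location; adjust it if the toolkit
    was installed somewhere else.
    """
    if path_env is None:
        path_env = os.environ.get("PATH", "")

    # The binary the shell would actually run when typing "nvcc".
    first = shutil.which("nvcc", path=path_env)

    found = set()
    # Every nvcc reachable through PATH, not just the first one.
    for directory in path_env.split(os.pathsep):
        candidate = os.path.join(directory, "nvcc")
        if directory and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            found.add(os.path.realpath(candidate))

    # Toolkits that are installed but possibly not on PATH at all.
    for candidate in glob.glob(cuda_glob):
        if os.path.isfile(candidate):
            found.add(os.path.realpath(candidate))

    return first, sorted(found)


if __name__ == "__main__":
    first, found = find_nvcc_candidates()
    print("nvcc resolved on PATH:", first)
    for path in found:
        print("nvcc found at:", path)
```

If the first result is something like /usr/bin/nvcc while a 12.6 copy shows up under /usr/local, the 11.5 compiler probably came from a distribution package and is shadowing the newer one. That may be a separate issue from the runtime error, but it seems worth ruling out.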

I’m able to compile an executable that uses CUDA, but when I run it I get this error:

cudaMalloc(&data, size * sizeof(T)) failed with error: no CUDA-capable device is detected
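
To separate a problem in my program from a problem with the driver, here is a minimal sketch that calls the CUDA driver API directly through ctypes. It rests on some assumptions: that WSL exposes the GPU as /dev/dxg and mounts the Windows driver's libcuda.so.1 under /usr/lib/wsl/lib, as I understand the CUDA on WSL guide to describe. The function name probe_cuda_driver is made up for illustration. cuInit and cuDeviceGetCount are real driver API calls.

```python
import ctypes
import os


def probe_cuda_driver(lib_names=("libcuda.so.1", "/usr/lib/wsl/lib/libcuda.so.1")):
    """Try to load the CUDA driver library and count devices.

    The default paths are assumptions about a WSL 2 setup. Returns a
    dict; values stay None for any step that could not be reached.
    """
    report = {
        "dxg_device": os.path.exists("/dev/dxg"),  # assumed WSL GPU device node
        "library": None,
        "cuInit": None,
        "device_count": None,
    }
    for name in lib_names:
        try:
            lib = ctypes.CDLL(name)
        except OSError:
            continue  # not found or not loadable, try the next candidate
        report["library"] = name
        # cuInit returns a CUresult; 0 is CUDA_SUCCESS, 100 is CUDA_ERROR_NO_DEVICE.
        report["cuInit"] = lib.cuInit(0)
        if report["cuInit"] == 0:
            count = ctypes.c_int(0)
            if lib.cuDeviceGetCount(ctypes.byref(count)) == 0:
                report["device_count"] = count.value
        break
    return report


if __name__ == "__main__":
    for key, value in probe_cuda_driver().items():
        print(f"{key}: {value}")
```

If "library" comes back None, the driver library is not visible inside WSL at all. If it loads but cuInit returns a non-zero code, the library is there but cannot reach the GPU, which would match the "no CUDA-capable device is detected" message.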

What steps can I take to troubleshoot this problem?