I noticed that after updating the NVIDIA driver in Windows, the nvidia-smi version in WSL is not the same as in Windows.
In WSL, the driver version is updated, but nvidia-smi and NVML are not. Why is that, and how do I update them?
Windows 10:
NVIDIA-SMI version : 552.12
NVML version : 552.12
DRIVER version : 552.12
CUDA Version : 12.4
WSL2 (Ubuntu):
NVIDIA-SMI version : 550.73.01
NVML version : 550.73
DRIVER version : 552.12
CUDA Version : 12.4
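For anyone who wants to check this mechanically, here is a minimal sketch that parses a listing in the "name : value" layout shown above and reports which tools lag behind the driver. The function names (parse_versions, numeric_prefix, mismatched_tools) are hypothetical helpers made up for this example, and it assumes the listing has been captured as plain text. It compares only the first two numeric fields, so 550.73.01 and 550.73 count as the same version.

```python
def parse_versions(text):
    """Parse 'name : value' lines into a dict keyed by lower-cased name."""
    versions = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a value, e.g. a heading such as "Windows 10:".
        if sep and value.strip():
            versions[key.strip().lower()] = value.strip()
    return versions


def numeric_prefix(version, parts=2):
    """Keep the first `parts` numeric fields of a dotted version string."""
    return tuple(int(p) for p in version.split(".")[:parts])


def mismatched_tools(versions):
    """Return the components whose major.minor differs from the driver's."""
    driver = numeric_prefix(versions["driver version"])
    return [
        name
        for name in ("nvidia-smi version", "nvml version")
        if numeric_prefix(versions[name]) != driver
    ]


# Sample data copied from the two listings in this post.
windows_report = """\
Windows 10:
NVIDIA-SMI version : 552.12
NVML version : 552.12
DRIVER version : 552.12
CUDA Version : 12.4
"""

wsl_report = """\
WSL2 (Ubuntu):
NVIDIA-SMI version : 550.73.01
NVML version : 550.73
DRIVER version : 552.12
CUDA Version : 12.4
"""

print("Windows:", mismatched_tools(parse_versions(windows_report)))
print("WSL:", mismatched_tools(parse_versions(wsl_report)))
```

With the numbers above, the Windows listing yields an empty list, while the WSL listing flags both nvidia-smi and NVML as behind the driver.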
I have the same issue: when I run nvidia-smi in WSL, it segfaults. Interestingly, when I run nvidia-smi.exe there is no segfault and the versions are all correct:
Windows 11:
NVIDIA-SMI version : 551.86
NVML version : 551.86
DRIVER version : 551.86
CUDA Version : 12.4
WSL2 (Ubuntu):
NVIDIA-SMI version : 550.65
NVML version : 550.65
DRIVER version : 551.86
CUDA Version : 12.4
I have run into the same issue. On the Windows side, nvidia-smi reports 560.86, but in WSL it is 560.34.
Moreover, a gdb backtrace of the nvidia-smi execution shows that the crash is in the driver's NVML library (libnvidia-ml.so.1):
(gdb) r
Starting program: /usr/lib/wsl/lib/nvidia-smi
Downloading separate debug info for system-supplied DSO at 0x7ffff7fc3000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Downloading separate debug info for /usr/lib/wsl/lib/libnvidia-ml.so.1
Downloading separate debug info for /usr/lib/wsl/drivers/nvhm.inf_amd64_1ddec84e4f6bcc38/libnvidia-ml.so.1
Downloading separate debug info for /usr/lib/wsl/lib/libcuda.so.1
Downloading separate debug info for /usr/lib/wsl/drivers/nvhm.inf_amd64_1ddec84e4f6bcc38/libcuda.so.1.1
Sun Dec 8 14:01:34 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.34                 Driver Version: 560.86         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6886fe1 in ?? () from /usr/lib/wsl/drivers/nvhm.inf_amd64_1ddec84e4f6bcc38/libnvidia-ml.so.1
(gdb) bt
#0 0x00007ffff6886fe1 in ?? () from /usr/lib/wsl/drivers/nvhm.inf_amd64_1ddec84e4f6bcc38/libnvidia-ml.so.1
#1 0x00007ffff6902235 in ?? () from /usr/lib/wsl/drivers/nvhm.inf_amd64_1ddec84e4f6bcc38/libnvidia-ml.so.1
#2 0x00007ffff68906be in ?? () from /usr/lib/wsl/drivers/nvhm.inf_amd64_1ddec84e4f6bcc38/libnvidia-ml.so.1
#3 0x00007ffff6877271 in nvmlDeviceGetDisplayActive ()
from /usr/lib/wsl/drivers/nvhm.inf_amd64_1ddec84e4f6bcc38/libnvidia-ml.so.1
#4 0x0000000000414ec8 in ?? ()