This is a Lenovo laptop that can run Windows 10 22H2 or Windows 11 23H2. It has an Intel Core i5-6300HQ CPU with integrated Intel(R) HD Graphics 530, plus a discrete NVIDIA GTX 960M GPU.
Steps to reproduce: I installed the latest version of the NVIDIA GPU driver, turned on the Windows Subsystem for Linux and Windows Hypervisor Platform features, set the WSL version to WSL2, and started a Linux distribution (such as Ubuntu 22.04) in WSL2. Running nvidia-smi inside WSL then prints "Segmentation fault", which means that “C:\Windows\System32\lxss\lib\nvidia-smi” is the binary that crashes.
(As we all know, “C:\Windows\System32\lxss\lib\nvidia-smi” appears inside WSL2 as “/mnt/c/Windows/System32/lxss/lib/nvidia-smi”.)
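For anyone who wants to script this path mapping, here is a minimal Python sketch. It assumes the default WSL automount root of /mnt (this can be changed in wsl.conf), and the function name windows_to_wsl_path is made up for illustration. Inside WSL the built-in wslpath utility does the same job.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl_path(win_path: str, mount_root: str = "/mnt") -> str:
    """Map an absolute Windows drive path to its WSL automount location.

    Assumption: drives are mounted under mount_root (the WSL default is /mnt).
    """
    path = PureWindowsPath(win_path)
    drive = path.drive.rstrip(":")
    # Only plain drive-letter paths such as C:\... are handled here;
    # relative paths and UNC shares are rejected.
    if not path.is_absolute() or len(drive) != 1 or not drive.isalpha():
        raise ValueError(f"not an absolute drive path: {win_path!r}")
    # parts[0] is the drive plus root ("C:\\"), so skip it.
    return str(PurePosixPath(mount_root, drive.lower(), *path.parts[1:]))
```

Calling windows_to_wsl_path(r"C:\Windows\System32\lxss\lib\nvidia-smi") gives the /mnt/c/... path quoted above.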
Of course, there is no problem with the Windows version of nvidia-smi.exe
See also: nvidia-smi segmentation fault in wsl2 but not in Windows · Issue #11277 · microsoft/WSL · GitHub
This problem has remained unsolved since 2024. It affects driver version 538 and all later versions.
Version 537.96 (GeForce Vulkan beta, which supports CUDA 12.2 at most) works normally. The download link is:
https://developer.nvidia.com/downloads/vulkan-beta-53796-windows
To add: in my testing, only the desktop versions of the RTX 3050 and RTX 4070 do not have this issue. I have asked Linux kernel development engineers, and they said that this issue needs to be fixed by you.
The segmentation fault appears to happen in NVML (libnvidia-ml.so.1):
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 552.55         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff66809a1 in ?? () from /usr/lib/wsl/drivers/nvltwi.inf_amd64_53dae1bddc8c687f/libnvidia-ml.so.1
As a smoke test, I tried generating some text using a HF Transformers language model. It didn’t crash and the output appeared fine.
As long as you don’t need NVML, you might be fine. From the stack trace above, it’s not clear whether the problem is on the nvidia-smi side or the libnvidia-ml.so.1 side.
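To check for the crash without gdb, a small Python wrapper can run a command in a child process and report whether it was killed by a signal. This is only a sketch: the helper names describe_exit and probe are made up for illustration, and it relies on the POSIX behaviour of subprocess, where a negative return code -N means the child died from signal N. It does not say which side the bug is on, only whether the process segfaults.

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable status."""
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return "ok" if returncode == 0 else f"exited with status {returncode}"


def probe(cmd, timeout=30.0):
    """Run cmd in a child process so a crash cannot take down the caller."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except FileNotFoundError:
        return "not found"
    except subprocess.TimeoutExpired:
        return "timed out"
    return describe_exit(proc.returncode)


if __name__ == "__main__":
    # Compare the Linux binary with the Windows one from inside WSL.
    for cmd in (["nvidia-smi"], ["nvidia-smi.exe"]):
        print(" ".join(cmd), "->", probe(cmd))
```

On an affected machine I would expect the first line to report SIGSEGV and the second to report ok, but I have only reasoned about this from the output above.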
The Windows version of nvidia-smi can be run from WSL as nvidia-smi.exe. It only shows the total GPU utilization, though; the per-process breakdown is not displayed:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 552.55                 Driver Version: 552.55         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA T500                  WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   59C    P0              N/A / ERR! |    1563MiB /   4096MiB |     93%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
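If all you need is the totals shown in the table above, the Windows binary can also be polled from a script inside WSL. The sketch below uses nvidia-smi's --query-gpu and --format=csv,noheader,nounits options. The helper names parse_gpu_csv and query_gpus are made up for illustration, and I am assuming nvidia-smi.exe is reachable on the WSL PATH. Values the driver cannot report (for example [N/A]) are mapped to None.

```python
import subprocess

# Fields requested from nvidia-smi, in the order they appear in each CSV row.
QUERY_FIELDS = ("utilization.gpu", "memory.used", "memory.total")


def parse_gpu_csv(text):
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        values = [value.strip() for value in line.split(",")]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip blank or unexpected lines
        rows.append(
            {
                field: int(value) if value.isdigit() else None
                for field, value in zip(QUERY_FIELDS, values)
            }
        )
    return rows


def query_gpus(exe="nvidia-smi.exe"):
    """Run the Windows nvidia-smi from WSL and return the parsed rows."""
    result = subprocess.run(
        [
            exe,
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_csv(result.stdout)
```

For the T500 above, a row of "93, 1563, 4096" would parse to 93 % utilization and 1563 of 4096 MiB used. This still gives no per-process data.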
Has anybody tried some other NVML-based monitoring tool that worked under the older driver versions?