GeForce GTX 960M: nvidia-smi segmentation fault in WSL2

This is a Lenovo laptop capable of running Windows 10 22H2 or Windows 11 23H2. The CPU is an Intel Core i5-6300HQ with integrated Intel(R) HD Graphics 530, and the discrete GPU is an NVIDIA GeForce GTX 960M.

Steps tried: I installed the latest NVIDIA GPU driver, enabled the Windows Subsystem for Linux and Windows Hypervisor Platform features, set the WSL version to WSL2, and ran a Linux distribution (e.g. Ubuntu 22.04) under WSL2. Running nvidia-smi inside WSL then prints "Segmentation fault". The binary that crashes is "C:\Windows\System32\lxss\lib\nvidia-smi"
(as is well known, inside WSL2 the path "C:\Windows\System32\lxss\lib\nvidia-smi" appears as "/mnt/c/Windows/System32/lxss/lib/nvidia-smi")

Of course, there is no problem with the Windows version, nvidia-smi.exe.

See also: nvidia-smi segmentation fault in wsl2 but not in Windows · Issue #11277 · microsoft/WSL · GitHub

This problem has existed since 2024 (driver 538 and subsequent versions) and remains unsolved to this day.

Driver version 537.96 (GeForce Vulkan BETA, supporting at most CUDA 12.2) works normally. The download link is attached:

https://developer.nvidia.com/downloads/vulkan-beta-53796-windows

To add: in my testing, only the desktop RTX 3050 and RTX 4070 are unaffected by this issue. I have asked Linux kernel development engineers, and they said that this issue needs to be fixed on your side.

The segmentation fault appears to happen in NVML (libnvidia-ml.so.1):

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 552.55         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff66809a1 in ?? () from /usr/lib/wsl/drivers/nvltwi.inf_amd64_53dae1bddc8c687f/libnvidia-ml.so.1

As a smoke test, I tried generating some text with an HF Transformers language model. It didn't crash and the output looked fine.
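Something along these lines is enough as a smoke test (a minimal sketch; the model name and prompt are placeholders, any small causal LM will do):

import torch
from transformers import pipeline

# Confirm CUDA is visible inside WSL2 before loading the model.
assert torch.cuda.is_available()
print(torch.cuda.get_device_name(0))

# "gpt2" is just a placeholder; any small causal language model works here.
generator = pipeline("text-generation", model="gpt2", device=0)
print(generator("Testing CUDA under WSL2:", max_new_tokens=20)[0]["generated_text"])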

As long as you don't need NVML, you might be fine. From the stack trace above, it's not clear whether the problem is on the nvidia-smi side or the libnvidia-ml.so.1 side.
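One way to narrow that down is to call libnvidia-ml.so.1 directly, bypassing nvidia-smi. A minimal sketch via ctypes (assuming the library is resolvable by its soname inside WSL; otherwise point CDLL at the file under /usr/lib/wsl/drivers/... shown in the stack trace):

import ctypes

# Load the WSL NVML library directly, without nvidia-smi in the picture.
nvml = ctypes.CDLL("libnvidia-ml.so.1")

ret = nvml.nvmlInit_v2()
print("nvmlInit_v2 ->", ret)  # 0 == NVML_SUCCESS

count = ctypes.c_uint()
print("nvmlDeviceGetCount_v2 ->", nvml.nvmlDeviceGetCount_v2(ctypes.byref(count)))
print("device count:", count.value)

version = ctypes.create_string_buffer(80)
nvml.nvmlSystemGetDriverVersion(version, ctypes.c_uint(len(version)))
print("driver:", version.value.decode())

nvml.nvmlShutdown()

If this segfaults at nvmlInit_v2 or at the first device query, the fault is in the library itself rather than in the nvidia-smi front end.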

The Windows version of nvidia-smi can be run as nvidia-smi.exe from within WSL. You can only see the total GPU utilization, though; the per-process breakdown is not displayed:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 552.55                 Driver Version: 552.55         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA T500                  WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   59C    P0             N/A / ERR!  |    1563MiB /   4096MiB |     93%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

Has anybody tried some other NVML-based monitoring tool that worked under the older driver versions?
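For what it's worth, the common ones (nvitop, gpustat, nvidia-ml-py) all go through the same libnvidia-ml.so.1, so I would expect them to hit the same fault on the affected drivers. A minimal pynvml sketch (assuming the nvidia-ml-py package) that queries exactly the per-process data nvidia-smi.exe does not show from WSL:

import pynvml  # pip install nvidia-ml-py

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

util = pynvml.nvmlDeviceGetUtilizationRates(handle)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
print(f"GPU util {util.gpu}%, memory {mem.used / 2**20:.0f}/{mem.total / 2**20:.0f} MiB")

# Per-process breakdown; whether this returns anything under WSL is a separate question.
for proc in pynvml.nvmlDeviceGetComputeRunningProcesses(handle):
    print(proc.pid, proc.usedGpuMemory)

pynvml.nvmlShutdown()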