GeForce GTX 960M: nvidia-smi segmentation fault in WSL2

This is a Lenovo laptop capable of running Windows 10 22H2 or Windows 11 23H2. The CPU is an Intel Core i5-6300HQ with integrated Intel(R) HD Graphics 530, and the discrete GPU is an NVIDIA GeForce GTX 960M.

Steps tried: I installed the latest NVIDIA GPU driver, enabled the Windows Subsystem for Linux and Windows Hypervisor Platform features, set the WSL version to WSL2, and ran a Linux distribution (e.g. Ubuntu 22.04) under WSL2. Running nvidia-smi inside WSL then prints "Segmentation fault". The binary that crashes is "C:\Windows\System32\lxss\lib\nvidia-smi"
(as is well known, inside WSL2 the path "C:\Windows\System32\lxss\lib\nvidia-smi" appears as "/mnt/c/Windows/System32/lxss/lib/nvidia-smi")

Of course, there is no problem with the Windows version, nvidia-smi.exe.

See also: nvidia-smi segmentation fault in wsl2 but not in Windows · Issue #11277 · microsoft/WSL · GitHub

This problem has existed since 2024 (driver 538 and subsequent versions) and remains unsolved to this day.

Driver version 537.96 (GeForce Vulkan BETA, supporting at most CUDA 12.2) works normally. The download link is attached:

https://developer.nvidia.com/downloads/vulkan-beta-53796-windows

To add: in my testing, only the desktop RTX 3050 and RTX 4070 are unaffected by this issue. I have asked Linux kernel development engineers, and they said that this issue needs to be fixed on your side.

The segmentation fault appears to happen in NVML (libnvidia-ml.so.1):

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 552.55         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff66809a1 in ?? () from /usr/lib/wsl/drivers/nvltwi.inf_amd64_53dae1bddc8c687f/libnvidia-ml.so.1

As a smoke test, I tried generating some text with an HF Transformers language model. It didn't crash and the output looked fine.
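Something along these lines is enough as a smoke test (a minimal sketch; the model name and prompt are placeholders, any small causal LM will do):

import torch
from transformers import pipeline

# Confirm CUDA is visible inside WSL2 before loading the model.
assert torch.cuda.is_available()
print(torch.cuda.get_device_name(0))

# "gpt2" is just a placeholder; any small causal language model works here.
generator = pipeline("text-generation", model="gpt2", device=0)
print(generator("Testing CUDA under WSL2:", max_new_tokens=20)[0]["generated_text"])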

As long as you don't need NVML, you might be fine. From the stack trace above, it's not clear whether the problem is on the nvidia-smi side or the libnvidia-ml.so.1 side.
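One way to narrow that down is to call libnvidia-ml.so.1 directly, bypassing nvidia-smi. A minimal sketch via ctypes (assuming the library is resolvable by its soname inside WSL; otherwise point CDLL at the file under /usr/lib/wsl/drivers/... shown in the stack trace):

import ctypes

# Load the WSL NVML library directly, without nvidia-smi in the picture.
nvml = ctypes.CDLL("libnvidia-ml.so.1")

ret = nvml.nvmlInit_v2()
print("nvmlInit_v2 ->", ret)  # 0 == NVML_SUCCESS

count = ctypes.c_uint()
print("nvmlDeviceGetCount_v2 ->", nvml.nvmlDeviceGetCount_v2(ctypes.byref(count)))
print("device count:", count.value)

version = ctypes.create_string_buffer(80)
nvml.nvmlSystemGetDriverVersion(version, ctypes.c_uint(len(version)))
print("driver:", version.value.decode())

nvml.nvmlShutdown()

If this segfaults at nvmlInit_v2 or at the first device query, the fault is in the library itself rather than in the nvidia-smi front end.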

The Windows version of nvidia-smi can be run as nvidia-smi.exe from within WSL. You can only see the total GPU utilization, though; the per-process breakdown is not displayed:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 552.55                 Driver Version: 552.55         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA T500                  WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   59C    P0             N/A / ERR!  |    1563MiB /   4096MiB |     93%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

Has anybody tried some other NVML-based monitoring tool that worked under the older driver versions?
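For what it's worth, the common ones (nvitop, gpustat, nvidia-ml-py) all go through the same libnvidia-ml.so.1, so I would expect them to hit the same fault on the affected drivers. A minimal pynvml sketch (assuming the nvidia-ml-py package) that queries exactly the per-process data nvidia-smi.exe does not show from WSL:

import pynvml  # pip install nvidia-ml-py

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

util = pynvml.nvmlDeviceGetUtilizationRates(handle)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
print(f"GPU util {util.gpu}%, memory {mem.used / 2**20:.0f}/{mem.total / 2**20:.0f} MiB")

# Per-process breakdown; whether this returns anything under WSL is a separate question.
for proc in pynvml.nvmlDeviceGetComputeRunningProcesses(handle):
    print(proc.pid, proc.usedGpuMemory)

pynvml.nvmlShutdown()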