WSL2 cuda-gdb is attached to a Python process for debugging, and it gets stuck on certain code

cuda-toolkit version 12.3
Ubuntu 18
WSL2

I followed this tutorial to debug my Python files.

Here is my test Python file. (It does not call any .cu file; it is just a test.)

import torch
b=[[4,5,6],[7,8,9]]
b=torch.tensor(b)
print("arrive here 11")
b=b.cuda()
print("arrive here 22")
print(b)
print(torch.unsqueeze(b,2).shape)

In step seven, my Python process blocks when executing b=b.cuda().

Without cuda-gdb attached, this code executes correctly.
What should I do to fix it? Looking forward to your reply.

The above works fine on a regular Linux system but not on WSL.

Hi @waitting33
Thank you for the report! To help us identify the issue, could you share additional information with us?

  • If available, please share the nvidia-smi output of the WSL2 setup.
  • Enable additional logging and share the resulting logs with us.
    • Set the NVLOG_CONFIG_FILE environment variable to point to the nvlog.config file (attached), e.g. NVLOG_CONFIG_FILE=${HOME}/nvlog.config
      nvlog.config (539 Bytes)

    • Run the debugging session.

    • You should see the /tmp/debugger.log file created - could you share it with us?
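The logging steps above could be scripted roughly as follows. This is only a minimal sketch, not an official procedure: the helper names `debugger_env` and `attach_command` are hypothetical, and it assumes the variable is read from the environment of the cuda-gdb process itself.

```python
import os


def debugger_env(config_path, base_env=None):
    """Return a copy of the environment with NVLOG_CONFIG_FILE set.

    Assumption: cuda-gdb picks the variable up from its own environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Expand a leading "~" so the debugger receives an absolute path.
    env["NVLOG_CONFIG_FILE"] = os.path.expanduser(config_path)
    return env


def attach_command(pid):
    """Build the argument list for attaching cuda-gdb to a running process."""
    return ["cuda-gdb", "-p", str(int(pid))]
```

A session could then be started with `subprocess.run(attach_command(pid), env=debugger_env("~/nvlog.config"))`, after which /tmp/debugger.log should appear as described above.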

Here are my log and nvidia-smi output.

(base) PS C:\Users\zxy\Desktop> nvidia-smi
Sun Nov 19 09:54:06 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 546.17                 Driver Version: 546.17       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4060 ...  WDDM  | 00000000:01:00.0  On |                  N/A |
| N/A   46C    P8               4W / 115W |    958MiB /  8188MiB |      4%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1364    C+G   ...rPicker\PowerToys.ColorPickerUI.exe    N/A      |
|    0   N/A  N/A      1804    C+G   C:\Windows\System32\dwm.exe               N/A      |
|    0   N/A  N/A      2456    C+G   ...oogle\Chrome\Application\chrome.exe    N/A      |
|    0   N/A  N/A      3612    C+G   ...2txyewy\StartMenuExperienceHost.exe    N/A      |
|    0   N/A  N/A      3644    C+G   ...crosoft\Edge\Application\msedge.exe    N/A      |
|    0   N/A  N/A      4564    C+G   ...cal\Microsoft\OneDrive\OneDrive.exe    N/A      |
|    0   N/A  N/A      8848    C+G   ...\MobaXterm\slash\bin\XWin_MobaX.exe    N/A      |
|    0   N/A  N/A      9276    C+G   ...ams\rok-a29ca11b\云翼网络加速器.exe    N/A      |
|    0   N/A  N/A      9892    C+G   ...nt.CBS_cw5n1h2txyewy\SearchHost.exe    N/A      |
|    0   N/A  N/A     10140    C+G   C:\Windows\explorer.exe                   N/A      |
|    0   N/A  N/A     11408    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe    N/A      |
|    0   N/A  N/A     11564    C+G   C:\Program Files\ToDesk\ToDesk.exe        N/A      |
|    0   N/A  N/A     11608    C+G   ...FancyZones\PowerToys.FancyZones.exe    N/A      |
|    0   N/A  N/A     11996    C+G   ...auncher\PowerToys.PowerLauncher.exe    N/A      |
|    0   N/A  N/A     14296    C+G   ...siveControlPanel\SystemSettings.exe    N/A      |
|    0   N/A  N/A     15520    C+G   ...es (x86)\Microsoft VS Code\Code.exe    N/A      |
|    0   N/A  N/A     17272    C+G   ...B\system_tray\lghub_system_tray.exe    N/A      |
|    0   N/A  N/A     17688    C+G   ...__8wekyb3d8bbwe\WindowsTerminal.exe    N/A      |
|    0   N/A  N/A     20440    C+G   C:\Windows\explorer.exe                   N/A      |
|    0   N/A  N/A     23644    C+G   ...5n1h2txyewy\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     24368    C+G   ...0_x64__8wekyb3d8bbwe\HxAccounts.exe    N/A      |
|    0   N/A  N/A     25568    C+G   ....0_x64__8wekyb3d8bbwe\HxOutlook.exe    N/A      |
|    0   N/A  N/A     34304    C+G   ...t.LockApp_cw5n1h2txyewy\LockApp.exe    N/A      |
+---------------------------------------------------------------------------------------+

debugger.log (26.1 MB)

@waitting33
Thank you for the logs. We are investigating the issue. Could you also share the console output of cuda-gdb when attaching to the application on WSL?

In the meantime, could you try running the application with the CUDA_MODULE_LOADING=EAGER environment variable?
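One way to try this from Python is to start the script in a child process whose environment already carries the variable, so that it is in place before CUDA initializes. This is a minimal sketch; `run_with_eager_loading` is a hypothetical helper name. Prefixing the launch command with `CUDA_MODULE_LOADING=EAGER` in the shell should have the same effect.

```python
import os
import subprocess
import sys


def run_with_eager_loading(python_args):
    """Run the current Python interpreter with CUDA_MODULE_LOADING=EAGER.

    python_args is the argument list that follows the interpreter,
    e.g. ["test.py"] (a placeholder script name).
    """
    # Copy the parent's environment and add the variable for the child only.
    env = dict(os.environ, CUDA_MODULE_LOADING="EAGER")
    return subprocess.run(
        [sys.executable, *python_args],
        env=env,
        capture_output=True,
        text=True,
    )
```

The child's PID is not exposed by `subprocess.run`. For an attach workflow, the script itself could print `os.getpid()` before its first CUDA call.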

I found that the program wasn’t stuck; it was just executing very slowly. The process I started yesterday had finished by today and printed its output normally.

Here is the cuda-gdb console output.
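To tell a slow call from a genuine hang, each step of the test script can be timed. A minimal sketch, with `timed` as a hypothetical helper name:

```python
import time


def timed(label, fn):
    """Call fn(), print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    # flush=True so the line shows up even if a later step stalls.
    print(f"{label} took {elapsed:.3f} s", flush=True)
    return result, elapsed


# Example with the tensor from the test script
# (requires PyTorch and a CUDA device):
#   b, seconds = timed("b.cuda()", lambda: b.cuda())
```

Comparing the reported time with and without cuda-gdb attached would show how much of the delay comes from the debugger.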

(sphere) zxy@DESKTOP-0TU3RE2:/mnt/e/jlu/SphereFormer$ cuda-gdb -p 4743
NVIDIA (R) CUDA Debugger
CUDA Toolkit 12.1 release
Portions Copyright (C) 2007-2023 NVIDIA Corporation
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
--Type <RET> for more, q to quit, c to continue without paging--
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
--Type <RET> for more, q to quit, c to continue without paging--
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 4743
Reading symbols from /home/zxy/mambaforge/envs/sphere/bin/python3.7...
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...
(No debugging symbols found in /lib/x86_64-linux-gnu/libpthread.so.0)
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...
(No debugging symbols found in /lib/x86_64-linux-gnu/libdl.so.2)
Reading symbols from /lib/x86_64-linux-gnu/libutil.so.1...
(No debugging symbols found in /lib/x86_64-linux-gnu/libutil.so.1)
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...
(No debugging symbols found in /lib/x86_64-linux-gnu/librt.so.1)
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...
(No debugging symbols found in /lib/x86_64-linux-gnu/libm.so.6)
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
(No debugging symbols found in /lib/x86_64-linux-gnu/libc.so.6)
Reading symbols from /lib64/ld-linux-x86-64.so.2...
(No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
Reading symbols from /home/zxy/mambaforge/envs/sphere/lib/python3.7/lib-dynload/_heapq.cpython-37m-x86_64-linux-gnu.so...
Reading symbols from /home/zxy/mambaforge/envs/sphere/lib/python3.7/lib-dynload/readline.cpython-37m-x86_64-linux-gnu.so...
Reading symbols from /home/zxy/mambaforge/envs/sphere/lib/python3.7/lib-dynload/../../libreadline.so.8...
(No debugging symbols found in /home/zxy/mambaforge/envs/sphere/lib/python3.7/lib-dynload/../../libreadline.so.8)
Reading symbols from /home/zxy/mambaforge/envs/sphere/lib/python3.7/lib-dynload/../.././libtinfo.so.6...
(No debugging symbols found in /home/zxy/mambaforge/envs/sphere/lib/python3.7/lib-dynload/../.././libtinfo.so.6)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f1d89690cd7 in select () from /lib/x86_64-linux-gnu/libc.so.6
(cuda-gdb) continue
Continuing.
[Detaching after fork from child process 4896]
[New Thread 0x7f1c9b2ad700 (LWP 4914)]
[New Thread 0x7f1c9aaac700 (LWP 4915)]
[New Thread 0x7f1c982ab700 (LWP 4916)]
[New Thread 0x7f1c93aaa700 (LWP 4917)]
[New Thread 0x7f1c912a9700 (LWP 4918)]
[New Thread 0x7f1c8eaa8700 (LWP 4919)]
[New Thread 0x7f1c8c2a7700 (LWP 4920)]
[New Thread 0x7f1c89aa6700 (LWP 4921)]
[New Thread 0x7f1c872a5700 (LWP 4922)]
[New Thread 0x7f1c86aa4700 (LWP 4923)]
[New Thread 0x7f1c842a3700 (LWP 4924)]
[New Thread 0x7f1c81aa2700 (LWP 4925)]
[New Thread 0x7f1c7d2a1700 (LWP 4926)]
[New Thread 0x7f1c7aaa0700 (LWP 4927)]
[New Thread 0x7f1c7829f700 (LWP 4928)]
[New Thread 0x7f1c75a9e700 (LWP 4929)]
[New Thread 0x7f1c7529d700 (LWP 4930)]
[New Thread 0x7f1c70a9c700 (LWP 4931)]
[New Thread 0x7f1c7029b700 (LWP 4932)]
[New Thread 0x7f1c669e5700 (LWP 4937)]
[New Thread 0x7f1c65700700 (LWP 4938)]
[Thread 0x7f1c86aa4700 (LWP 4923) exited]
[Thread 0x7f1c89aa6700 (LWP 4921) exited]
[Thread 0x7f1c8c2a7700 (LWP 4920) exited]
[Thread 0x7f1c93aaa700 (LWP 4917) exited]
[Thread 0x7f1c982ab700 (LWP 4916) exited]
[Thread 0x7f1c9aaac700 (LWP 4915) exited]
[Thread 0x7f1c912a9700 (LWP 4918) exited]
[Thread 0x7f1c9b2ad700 (LWP 4914) exited]
[Thread 0x7f1c75a9e700 (LWP 4929) exited]
[Thread 0x7f1c70a9c700 (LWP 4931) exited]
[Thread 0x7f1c7529d700 (LWP 4930) exited]
[Thread 0x7f1c7aaa0700 (LWP 4927) exited]
[Thread 0x7f1c7029b700 (LWP 4932) exited]
[Thread 0x7f1c7d2a1700 (LWP 4926) exited]
[Thread 0x7f1c842a3700 (LWP 4924) exited]
[Thread 0x7f1c872a5700 (LWP 4922) exited]
[Thread 0x7f1c8eaa8700 (LWP 4919) exited]
[Thread 0x7f1c81aa2700 (LWP 4925) exited]
[Thread 0x7f1c7829f700 (LWP 4928) exited]
[Detaching after fork from child process 4939]
[New Thread 0x7f1c7029b700 (LWP 4948)]
[Thread 0x7f1c7029b700 (LWP 4948) exited]
[New Thread 0x7f1c7029b700 (LWP 4949)]

Hi @waitting33
Thank you for the update! Would you be able to try the same scenario, but launch the app with the CUDA_MODULE_LOADING=EAGER environment variable?

Hello, I tried CUDA_MODULE_LOADING=EAGER, but the program still executes very slowly. It has been ten minutes now and it still has not finished.

Is there a way we can reproduce this locally? Do you have an app that triggers this issue and that you can share with us?

Also, could you share the debugger.log (see post #3 by AKravets in this thread) for the run with CUDA_MODULE_LOADING=EAGER?

Sorry, I don’t really know how to reproduce this one; I don’t have any other apps open except PyCharm. This is the log with CUDA_MODULE_LOADING=EAGER.
debugger.zip (12.8 MB)

Hi @waitting33
Thank you for the updated logs. We will try to diagnose the issue based on the logs provided.