"A Unwinder should return gdb.UnwindInfo instance." error with NVIDIA Nsight Visual Studio Code Edition

I’m following the guide in the documentation (Getting Started with the CUDA Debugger :: NVIDIA Nsight VSCE Documentation) and set up VS Code with the Nsight plugin. I’m on the most recent versions of VS Code (Version: 1.79.2 (Universal)) and Nsight (v2023.2.32964508). I was able to build the matrixMul code and launch the cuda-gdb debugger. However, I get the following error message in VS Code: “A Unwinder should return gdb.UnwindInfo instance.”. The error shows for every frame of the call stack, the Locals pane contains no values (many variables show “<optimized out>”), and hovering over variables in the CUDA kernel shows a blank black box. I have the -g -G nvcc flags set, since the tutorial builds with make dbg=1, and have included the nvcc build commands below as well.
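For reference, a minimal sketch of the debug build steps I followed (the cuda-samples path is an assumption based on the standard repository layout):

cd cuda-samples/Samples/0_Introduction/matrixMul   # assumed sample location
make clean
make dbg=1    # dbg=1 makes the sample Makefile pass -g (host) and -G (device) to nvcc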

Here is the output from the cuda-gdb terminal:

NVIDIA (R) CUDA Debugger
11.8 release
Portions Copyright (C) 2007-2022 NVIDIA Corporation
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff529a000 (LWP 1350557)]
[Detaching after fork from child process 1350558]
[New Thread 0x7fffe941a000 (LWP 1350581)]
[New Thread 0x7fffe8c19000 (LWP 1350582)]

Thread 1 "matrixMul" hit Breakpoint 3, 
Thread 1 "matrixMul" hit Breakpoint 1, MatrixMulCUDA<32><<<(20,10,1),(32,32,1)>>> (C=, A=, B=, wA=, wB=) at matrixMul.cu:70
70	  int aBegin = wA * BLOCK_SIZE * by;
cuda block (0, 0, 0) thread (0, 0, 0)
CUDA focus unchanged.
cuda block (0, 0, 0) thread (0, 0, 0)
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
70	  int aBegin = wA * BLOCK_SIZE * by;
cuda block (0, 0, 0) thread (0, 0, 0)
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
70	  int aBegin = wA * BLOCK_SIZE * by;
cuda block (0, 0, 0) thread (0, 0, 0)
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
70	  int aBegin = wA * BLOCK_SIZE * by;
cuda block (0, 0, 0) thread (0, 0, 0)
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
70	  int aBegin = wA * BLOCK_SIZE * by;
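As far as I understand, cuda-gdb reprints the focus line each time execution stops on the same thread; the focus can also be inspected or moved manually, e.g.:

(cuda-gdb) cuda kernel block thread            # print the current focus
(cuda-gdb) cuda block (0,0,0) thread (1,0,0)   # switch focus to a different thread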

Here are the nvcc build commands:

nvcc -ccbin g++ -I../../../Common  -m64 -g -G -Xcompiler -O0 -Xptxas -O0 -lineinfo -O0    --threads 0 --std=c++11 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -o matrixMul.o -c matrixMul.cu

nvcc -ccbin g++   -m64 -g -G -Xcompiler -O0 -Xptxas -O0 -lineinfo -O0      -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -o matrixMul matrixMul.o 

I added -lineinfo based on a Stack Overflow post that mentioned it, but it appears that -G overrides it.
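For anyone comparing the flags: as I understand it, -G (full device debug info, device optimizations disabled) already embeds the line mapping that -lineinfo provides, so there is no point passing both. A simplified sketch of the two build styles:

# Debug build for cuda-gdb: -G embeds full device debug info
nvcc -g -G -O0 -o matrixMul_dbg matrixMul.cu
# Profiling-friendly build: keep optimizations, add only the source line mapping
nvcc -lineinfo -O2 -o matrixMul matrixMul.cu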

Here is a screenshot of the local variables (I’m only able to add one screenshot).

I’m connected via the Remote-SSH plugin to an Ubuntu 20.04.5 server with eight A100s.

I added the logFile option to the launch configuration and am including the relevant log output.

[17:27:06.813 UTC] GDB result: 11 done,InfoCudaDevicesTable={nr_rows="8",nr_cols="11",hdr=[{width="1",alignment="1",col_name="current",colhdr=" "},{width="3",alignment="1",col_name="device",colhdr="Dev"},{width="14",alignment="1",col_name="pci_bus",colhdr="PCI Bus/Dev ID"},{width="21",alignment="1",col_name="name",colhdr="Name"},{width="11",alignment="1",col_name="description",colhdr="Description"},{width="7",alignment="1",col_name="sm_type",colhdr="SM Type"},{width="3",alignment="1",col_name="num_sms",colhdr="SMs"},{width="8",alignment="1",col_name="num_warps",colhdr="Warps/SM"},{width="10",alignment="1",col_name="num_lanes",colhdr="Lanes/Warp"},{width="13",alignment="1",col_name="num_regs",colhdr="Max Regs/Lane"},{width="66",alignment="1",col_name="active_sms_mask",colhdr="Active SMs Mask"}],body=[InfoCudaDevicesRow={current=" ",device="0",pci_bus="06:00.0",name="NVIDIA A100-SXM4-80GB",description="GA100GL-A",sm_type="sm_80",num_sms="108",num_warps="64",num_lanes="32",num_regs="256",active_sms_mask="0x000000000000000
[17:27:06.813 UTC] GDB -cont-: 11 0000000000000000000000000000000000000000000000000"},InfoCudaDevicesRow={current=" ",device="1",pci_bus="07:00.0",name="NVIDIA A100-SXM4-80GB",description="GA100GL-A",sm_type="sm_80",num_sms="108",num_warps="64",num_lanes="32",num_regs="256",active_sms_mask="0x0000000000000000000000000000000000000000000000000000000000000000"},InfoCudaDevicesRow={current=" ",device="2",pci_bus="08:00.0",name="NVIDIA A100-SXM4-80GB",description="GA100GL-A",sm_type="sm_80",num_sms="108",num_warps="64",num_lanes="32",num_regs="256",active_sms_mask="0x0000000000000000000000000000000000000000000000000000000000000000"},InfoCudaDevicesRow={current=" ",device="3",pci_bus="09:00.0",name="NVIDIA A100-SXM4-80GB",description="GA100GL-A",sm_type="sm_80",num_sms="108",num_warps="64",num_lanes="32",num_regs="256",active_sms_mask="0x0000000000000000000000000000000000000000000000000000000000000000"},InfoCudaDevicesRow={current=" ",device="4",pci_bus="0a:00.0",name="NVIDIA A100-SXM4-80GB",description="GA100GL-A",sm_type="
[17:27:06.813 UTC] GDB -cont-: 11 sm_80",num_sms="108",num_warps="64",num_lanes="32",num_regs="256",active_sms_mask="0x0000000000000000000000000000000000000000000000000000000000000000"},InfoCudaDevicesRow={current=" ",device="5",pci_bus="0b:00.0",name="NVIDIA A100-SXM4-80GB",description="GA100GL-A",sm_type="sm_80",num_sms="108",num_warps="64",num_lanes="32",num_regs="256",active_sms_mask="0x0000000000000000000000000000000000000000000000000000000000000000"},InfoCudaDevicesRow={current=" ",device="6",pci_bus="0c:00.0",name="NVIDIA A100-SXM4-80GB",description="GA100GL-A",sm_type="sm_80",num_sms="108",num_warps="64",num_lanes="32",num_regs="256",active_sms_mask="0x0000000000000000000000000000000000000000000000000000000000000000"},InfoCudaDevicesRow={current=" ",device="7",pci_bus="0d:00.0",name="NVIDIA A100-SXM4-80GB",description="GA100GL-A",sm_type="sm_80",num_sms="108",num_warps="64",num_lanes="32",num_regs="256",active_sms_mask="0x0000000000000000000000000000000000000000000000000000000000000000"}]}
[17:27:06.814 UTC] To client: {"seq":0,"type":"response","request_seq":6,"command":"systemInfo","success":true,"body":{"systemInfo":{"os":{"platform":"linux","architecture":"x64","distribution":"ubuntu","distributionVersion":"20.04"},"gpus":[{"name":"NVIDIA A100-SXM4-80GB","description":"GA100GL-A","smType":"sm_80"},{"name":"NVIDIA A100-SXM4-80GB","description":"GA100GL-A","smType":"sm_80"},{"name":"NVIDIA A100-SXM4-80GB","description":"GA100GL-A","smType":"sm_80"},{"name":"NVIDIA A100-SXM4-80GB","description":"GA100GL-A","smType":"sm_80"},{"name":"NVIDIA A100-SXM4-80GB","description":"GA100GL-A","smType":"sm_80"},{"name":"NVIDIA A100-SXM4-80GB","description":"GA100GL-A","smType":"sm_80"},{"name":"NVIDIA A100-SXM4-80GB","description":"GA100GL-A","smType":"sm_80"},{"name":"NVIDIA A100-SXM4-80GB","description":"GA100GL-A","smType":"sm_80"}]}}}
[17:27:06.869 UTC] GDB result: 12 done,threads=[{id="1",target-id="Thread 0x7ffff7a3c000 (LWP 1380696)",name="matrixMul",state="stopped",core="36"},{id="2",target-id="Thread 0x7fffee914000 (LWP 1380727)",name="cuda-EvtHandlr",state="stopped",core="37"},{id="3",target-id="Thread 0x7fffee113000 (LWP 1380728)",name="cuda-EvtHandlr",state="stopped",core="39"}],current-thread-id="1"
[17:27:06.869 UTC] To client: {"seq":0,"type":"response","request_seq":8,"command":"threads","success":true,"body":{"threads":[{"id":99999,"name":"(CUDA)"},{"id":1,"name":"matrixMul"},{"id":2,"name":"cuda-EvtHandlr"},{"id":3,"name":"cuda-EvtHandlr"}]}}
[17:27:06.912 UTC] From client: stackTrace({"threadId":1,"startFrame":0,"levels":20})
[17:27:06.912 UTC] GDB command: 13 -stack-info-depth --thread 1 100
[17:27:06.912 UTC] GDB result: 13 error,msg="A Unwinder should return gdb.UnwindInfo instance."
[17:27:06.913 UTC] To client: {"seq":0,"type":"response","request_seq":9,"command":"stackTrace","success":false,"message":"A Unwinder should return gdb.UnwindInfo instance.","body":{"error":{"id":1,"format":"A Unwinder should return gdb.UnwindInfo instance.","showUser":true}}}
[17:27:09.099 UTC] From client: disconnect({"restart":false})
[17:27:09.099 UTC] GDB command: 14 -gdb-exit
[17:27:09.099 UTC] GDB result: 14 exit

Hi!

I noticed you are using cuda-gdb 11.8. Could you try upgrading to 12.0+?
Here is the latest CUDA Toolkit (CTK): CUDA Toolkit 12.2 Downloads | NVIDIA Developer.

I’m trying to upgrade and get the following error:

The following packages have unmet dependencies:
 cuda : Depends: cuda-12-2 (>= 12.2.0) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

If you are on an Ubuntu system, can you try:

sudo apt update
sudo apt purge nvidia-* 
sudo apt autoremove
sudo apt install -y cuda

My suggestions are based on this post: CUDA install unmet dependencies: cuda : Depends: cuda-10-0 (>= 10.0.130) but it is not going to be installed - #6 by rojegi
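If apt still reports held packages after that, it may also be worth listing the holds and letting apt attempt a repair (generic apt commands, not CUDA-specific):

apt-mark showhold               # list packages currently held back
sudo apt --fix-broken install   # let apt try to resolve broken dependencies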

I can now run nvidia-smi and get the following:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  | 00000000:06:00.0 Off |                    0 |
| N/A   35C    P0              64W / 400W |      4MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-80GB          On  | 00000000:07:00.0 Off |                    0 |
| N/A   34C    P0              67W / 400W |      4MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA A100-SXM4-80GB          On  | 00000000:08:00.0 Off |                    0 |
| N/A   35C    P0              64W / 400W |      4MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA A100-SXM4-80GB          On  | 00000000:09:00.0 Off |                    0 |
| N/A   36C    P0              71W / 400W |      4MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM4-80GB          On  | 00000000:0A:00.0 Off |                    0 |
| N/A   35C    P0              70W / 400W |      4MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   5  NVIDIA A100-SXM4-80GB          On  | 00000000:0B:00.0 Off |                    0 |
| N/A   34C    P0              72W / 400W |      4MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   6  NVIDIA A100-SXM4-80GB          On  | 00000000:0C:00.0 Off |                    0 |
| N/A   36C    P0              70W / 400W |      4MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   7  NVIDIA A100-SXM4-80GB          On  | 00000000:0D:00.0 Off |                    0 |
| N/A   38C    P0              70W / 400W |      4MiB / 81920MiB |      3%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

But I cannot use nvcc or cuda-gdb:

nvcc --version
-bash: /usr/bin/nvcc: No such file or directory

cuda-gdb --version
-bash: /usr/bin/cuda-gdb: No such file or directory
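A likely cause is that the toolkit installs under /usr/local/cuda-12.2, which is not on PATH by default; a typical fix, assuming the default install prefix:

export PATH=/usr/local/cuda-12.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH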

Got that working, and now I get this:

[19:01:59.065 UTC] GDB command: 9 -cuda-info-devices
[19:01:59.188 UTC] To client: {"seq":0,"type":"event","event":"output","body":{"category":"log","output":"fatal:  No CUDA capable device was found. (error code = CUDBG_ERROR_NO_DEVICE_AVAILABLE(0x27)\n"}}
[19:01:59.188 UTC] GDB result: 9 done,InfoCudaDevicesTable={nr_rows="0",nr_cols="11",hdr=[{width="1",alignment="1",col_name="current",colhdr=" "},{width="3",alignment="1",col_name="device",colhdr="Dev"},{width="14",alignment="1",col_name="pci_bus",colhdr="PCI Bus/Dev ID"},{width="4",alignment="1",col_name="name",colhdr="Name"},{width="11",alignment="1",col_name="description",colhdr="Description"},{width="7",alignment="1",col_name="sm_type",colhdr="SM Type"},{width="3",alignment="1",col_name="num_sms",colhdr="SMs"},{width="8",alignment="1",col_name="num_warps",colhdr="Warps/SM"},{width="10",alignment="1",col_name="num_lanes",colhdr="Lanes/Warp"},{width="13",alignment="1",col_name="num_regs",colhdr="Max Regs/Lane"},{width="15",alignment="1",col_name="active_sms_mask",colhdr="Active SMs Mask"}],body=[]}
[19:01:59.188 UTC] To client: {"seq":0,"type":"response","request_seq":6,"command":"systemInfo","success":true,"body":{"systemInfo":{"os":{"platform":"linux","architecture":"x64","distribution":"ubuntu","distributionVersion":"20.04"},"gpus":[]}}}

Do you mean nvidia-smi works now? From the previous message it seemed like the toolkit installation did not go through properly. (It seemed like the files were all in place, but some dependencies were not being properly resolved.)

Can you share the output of doing ldd /usr/local/cuda-12.2/bin/nvcc and file /usr/local/cuda-12.2/bin/nvcc on your system? This is assuming nvcc is present in the default path on your system. If not, you can replace it with the path that you installed it to.
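That is:

ldd /usr/local/cuda-12.2/bin/nvcc    # shows which shared libraries resolve (or fail to)
file /usr/local/cuda-12.2/bin/nvcc   # confirms the binary type and architecture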

I was able to get nvidia-smi to work, but there were issues with the fabric manager and NVIDIA settings. I could only find packages for the 525 driver, not 535.

After uninstalling multiple times, I finally got the upgrade to fully work, including the fabric manager, and this upgrade resolved the problem! Thanks! Basically, I had to wipe everything and use the runfile:

wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
sudo sh cuda_12.2.0_535.54.03_linux.run
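For a non-interactive install, the runfile also accepts component-selection flags, e.g. (assuming you want both the driver and the toolkit):

sudo sh cuda_12.2.0_535.54.03_linux.run --silent --driver --toolkit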

There was an issue installing the driver because the nvidia kernel module was in use (e.g. the sudo rmmod nvidia command failed), so I had to run sudo lsof | grep nvidia to find the processes using it (it was the fabric manager).

This answer (drivers - "Module nvidia is in use" but there are no processes running on the GPU - Ask Ubuntu) helped me kill the processes holding the module so I could properly uninstall everything and then install it correctly with the runfile.
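In case it helps anyone else, the unload sequence looked roughly like this (a sketch; it assumes the fabric manager is what holds the module, and some listed modules may not be loaded on every system):

sudo systemctl stop nvidia-fabricmanager                 # stop the service holding the module
sudo lsof /dev/nvidia*                                   # confirm nothing else has the devices open
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia   # unload in dependency order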

Glad to know it’s fixed!
