I am using Visual Studio Code (VS Code) on Windows 11 to connect to WSL2 (Ubuntu 22.04), and I am encountering an error when attempting to debug with cuda-gdb. Strangely, the compiled executable runs flawlessly when executed directly. The debugger output is:
NVIDIA (R) CUDA Debugger
CUDA Toolkit 12.1 release
Portions Copyright (C) 2007-2023 NVIDIA Corporation
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
warning: Cuda API error detected: cuModuleLoadFatBinary returned (0xd1)
warning: Cuda API error detected: cuModuleLoadFatBinary returned (0xd1)
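For reference, the raw 0xd1 status in those warnings can be decoded with the CUDA driver API. A minimal sketch written for this post (not part of the sample itself), built with nvcc decode.cu -o decode -lcuda:

#include <cuda.h>    // driver API: CUresult, cuGetErrorName, cuGetErrorString
#include <cstdio>

int main() {
    const char* name = nullptr;
    const char* desc = nullptr;
    // 0xd1 == 209, the status returned by cuModuleLoadFatBinary above
    cuGetErrorName(static_cast<CUresult>(0xd1), &name);
    cuGetErrorString(static_cast<CUresult>(0xd1), &desc);
    std::printf("%s: %s\n", name ? name : "unknown", desc ? desc : "unknown");
    return 0;
}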
My setup is as follows (full nvidia-smi -q output):
==============NVSMI LOG==============
Timestamp : Tue Apr 30 00:09:53 2024
Driver Version : 552.22
CUDA Version : 12.4
Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : NVIDIA GeForce RTX 3060 Laptop GPU
Product Brand : GeForce
Product Architecture : Ampere
Display Mode : Enabled
Display Active : Enabled
Persistence Mode : Enabled
Addressing Mode : N/A
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : WDDM
Pending : WDDM
Serial Number : N/A
GPU UUID : GPU-39ea9d63-045b-112d-1973-27cc4470dba9
Minor Number : N/A
VBIOS Version : 94.06.19.00.3e
MultiGPU Board : No
Board ID : 0x100
Board Part Number : N/A
GPU Part Number : 2520-775-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G001.0000.03.03
OEM Object : 2.0
ECC Object : N/A
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : N/A
GSP Firmware Version : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x252010DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x380117AA
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Device Current : 3
Device Max : 4
Host Max : 3
Link Width
Max : 16x
Current : 8x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 2 KB/s
Rx Throughput : 4 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : N/A
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 6144 MiB
Reserved : 148 MiB
Used : 1239 MiB
Free : 4758 MiB
BAR1 Memory Usage
Total : 8192 MiB
Used : 1 MiB
Free : 8191 MiB
Conf Compute Protected Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : 5 %
Memory : 16 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
SRAM Threshold Exceeded : N/A
Aggregate Uncorrectable SRAM Sources
SRAM L2 : N/A
SRAM SM : N/A
SRAM Microcontroller : N/A
SRAM PCIE : N/A
SRAM Other : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows : N/A
Temperature
GPU Current Temp : 39 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 105 C
GPU Slowdown Temp : 102 C
GPU Max Operating Temp : 105 C
GPU Target Temperature : 87 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Power Draw : 13.17 W
Current Power Limit : 75.00 W
Requested Power Limit : N/A
Default Power Limit : 60.00 W
Min Power Limit : 1.00 W
Max Power Limit : 75.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 210 MHz
SM : 210 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 6001 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 643.750 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes : None
The debugging target is a simple matrix multiplication example using cuSPARSELt, specifically the matmul_example.cpp file. The issue arises during step-by-step debugging at the line CHECK_CUSPARSE( cusparseLtInit(&handle) ), which leads me to suspect a problem with cuSPARSELt.
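For context, the failing call and the sample's error-checking macro look roughly like this (paraphrased from the matmul example, not the verbatim file):

#include <cusparseLt.h>   // cusparseLtHandle_t, cusparseLtInit, cusparseLtDestroy
#include <cstdio>
#include <cstdlib>

// Paraphrase of the sample's error-checking macro.
#define CHECK_CUSPARSE(func)                                              \
{                                                                         \
    cusparseStatus_t status = (func);                                     \
    if (status != CUSPARSE_STATUS_SUCCESS) {                              \
        std::printf("cuSPARSELt API failed at line %d with error %d\n",   \
                    __LINE__, static_cast<int>(status));                  \
        return EXIT_FAILURE;                                              \
    }                                                                     \
}

int main() {
    cusparseLtHandle_t handle;
    CHECK_CUSPARSE( cusparseLtInit(&handle) )    // <-- stepping stops here
    CHECK_CUSPARSE( cusparseLtDestroy(&handle) )
    return EXIT_SUCCESS;
}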
My tasks.json and launch.json configurations are as follows:
tasks.json
{
"version": "2.0.0",
"tasks": [
{
"label": "mynvcc",
"type": "shell",
"command": "nvcc",
"args": ["-lcusparse","-g","-G","-o","${fileDirname}/test","${file}"]
},
{
"label": "mynvcc2",
"type": "shell",
"command": "nvcc",
"args": ["-lcublas","-g","-G","-o","${fileDirname}/test","${file}"]
},
{
"label": "cusparseLtNvcc",
"type": "shell",
"command": "nvcc",
"args": ["-lcusparseLt","-lcusparse","-ldl","-gencode","arch=compute_80,code=sm_80","-g","-G","-o","${fileDirname}/test","${file}"]
},
{
"label": "cusparseLtNvccStatic",
"type": "shell",
"command": "nvcc",
"args": ["-lcusparseLt","-lcusparse","-lcudart","-lcuda","-Xlinker=/home/whh/workspace/env/cusparseLt/lib64/libcusparseLt_static.a","-ldl","-gencode","arch=compute_80,code=sm_80","-g","-G","-o","${fileDirname}/test_static","${file}"]
}
]
}
launch.json
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Cusparse C++: Launch",
"type": "cuda-gdb",
"request": "launch",
"program": "${fileDirname}/test",
"preLaunchTask": "mynvcc"
},
{
"name": "Cusparse C++: Attach",
"type": "cuda-gdb",
"request": "attach"
},
{
"name": "Cublas C++: Launch",
"type": "cuda-gdb",
"request": "launch",
"program": "${fileDirname}/test",
"preLaunchTask": "mynvcc2"
},
{
"name": "Cublas C++: Attach",
"type": "cuda-gdb",
"request": "attach"
},
{
"name": "CusparseLt: Launch",
"type": "cuda-gdb",
"debuggerPath": "/usr/local/cuda/bin/cuda-gdb",
"request": "launch",
"program": "${fileDirname}/test",
"preLaunchTask": "cusparseLtNvcc"
},
{
"name": "CusparseLt: Attach",
"type": "cuda-gdb",
"debuggerPath": "/usr/local/cuda/bin/cuda-gdb",
"request": "attach",
"program": "${fileDirname}/test",
"preLaunchTask": "cusparseLtNvcc"
},
{
"name": "CusparseLtStatic: Launch",
"type": "cuda-gdb",
"debuggerPath": "/usr/local/cuda/bin/cuda-gdb",
"request": "launch",
"program": "${fileDirname}/test_static",
"preLaunchTask": "cusparseLtNvccStatic"
},
]
}
Regarding cuSPARSELt itself, it is peculiar that after downloading version 0.6.0 there is no distinctly named cuSPARSELt folder, whereas the official tutorials reference paths like ${CUSPARSELT_DIR}/include; this discrepancy is confusing. The error code reported during debugging (0xd1 = 209) corresponds to:
CUDA_ERROR_NO_BINARY_FOR_GPU = 209
This indicates that there is no kernel image available that is suitable for the device. This can occur when a user specifies code generation options for a particular CUDA source file that do not include the corresponding device configuration.
However, I'm at a loss for how to address this particular issue. Any guidance or insights from the CUDA community would be greatly appreciated.
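For reference, the device configuration that the fat binary must cover can be checked at runtime with the CUDA runtime API; a minimal sketch (device 0 assumed):

#include <cuda_runtime.h>
#include <cstdio>

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::printf("failed to query device 0\n");
        return 1;
    }
    // A -gencode flag such as arch=compute_XY,code=sm_XY must be
    // compatible with the major.minor value printed here.
    std::printf("%s: compute capability %d.%d\n",
                prop.name, prop.major, prop.minor);
    return 0;
}

(The RTX 3060 Laptop GPU should report 8.6; note that the build tasks above pass arch=compute_80,code=sm_80.)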
By the way, my CUDA driver version is:
NVIDIA-SMI 550.76.01 Driver Version: 552.22 CUDA Version: 12.4
and my CUDA Toolkit version is:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0