Cuda-GDB doesn't hit breakpoints inside kernel/ if the kernel is in a library and then linked to the executable

Hello,

I have a problem when trying to debug the GPU part of the code. The breakpoints in kernels and device functions cannot be hit, when they are in a cuda library which then is linked to an executable. The breakpoints seem to “spring to the end of the kernel function”. When everything is compiled in one monolithic executable I can hit those break points. Is there a solution for this?

Working system: Linux 18.04, cuda version: 11.6, GPU: NVIDIA GeForce RTX 3070 Laptop GPU, Driver Version: 510.39.01

The behavior can be reproduced with a simple project, that can be found here: GitHub - kefalakis/cuda_gdb

Thanks for submitting this. For completeness, I see a .vscode directory in the repo. Were you using the VSCode console, VSCode GUI, or CUDA-GDB from a normal shell to set the breakpoint?

I used both VSCode GUI and CUDA-GDB.
In VSCode GUI I don’t get any warning or error in the debug window, the breakpoint just jumps to line 8.
While in CUDA-GDB I get the following “error”:
Single stepping until exit from function _Z6kernelv, which has no line number information.

PS the break point is set in the kernel function (main.cu line 5) and the it jumps to line 8.

Hi @nionios

I managed to reproduce this issue locally (with recent CUDA version).

The breakpoints in kernels and device functions cannot be hit, when they are in a cuda library which then is linked to an executable.

Can you share the commands used to set the breakpoints? Using the not_debugable binary I got the following:

  • Setting via file:line - not working.
  • Setting via kernel name - works:
(cuda-gdb) b kernel
Breakpoint 2 at 0x5555555cfb92: file /home/akravets/Downloads/cuda_gdb-master/libr/main.cu, line 5.
(cuda-gdb) r
Starting program: /home/akravets/Downloads/cuda_gdb-master/not_debugable 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff540b000 (LWP 1094100)]
[New Thread 0x7fffeffff000 (LWP 1094101)]
[Detaching after fork from child process 1094102]
[New Thread 0x7fffef2dd000 (LWP 1094112)]
[New Thread 0x7fffee438000 (LWP 1094113)]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]

Thread 1 "not_debugable" hit Breakpoint 2, 0x00007fffc7036800 in kernel()<<<(1,1,1),(5,1,1)>>> ()
(cuda-gdb) l
1       /tmp/tmpxft_0010afa7_00000000-3_main.fatbin.c: No such file or directory.

Note that due to separable compilation the file name is different.

  • Using cuda break_on_launch - works
(cuda-gdb) set cuda break_on_launch all
(cuda-gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y 
Starting program: /home/akravets/Downloads/cuda_gdb-master/not_debugable 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff540b000 (LWP 1094128)]
[New Thread 0x7fffeffff000 (LWP 1094129)]
[Detaching after fork from child process 1094130]
[New Thread 0x7fffef2dd000 (LWP 1094141)]
[New Thread 0x7fffee438000 (LWP 1094142)]
[Switching focus to CUDA kernel 1, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
0x00007fffc7036800 in kernel()<<<(1,1,1),(5,1,1)>>> ()

Can you check whether setting breakpoint via kernel name works for you?

Thanks for your response, the problem comes right after this line, so the breakpoint is in line 5 now and then if you type next then i get the warning, and it doesn’t go to line 6 but line 8 which is the end of the GPU part. Can you test if this is happening to you as well.

Breakpoint 1, kernel () at /home/nionios/git/test/cuda_gdb/libr/main.cu:5
5	{
(cuda-gdb) next
[Detaching after fork from child process 14787]
[New Thread 0x7fffef247700 (LWP 14793)]
[New Thread 0x7fffeea46700 (LWP 14794)]
[New Thread 0x7fffee143700 (LWP 14795)]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]

Thread 1 "not_debugable" hit Breakpoint 1, 0x0000555555bd3500 in kernel()
   <<<(1,1,1),(5,1,1)>>> ()
(cuda-gdb) next
Single stepping until exit from function _Z6kernelv,
which has no line number information.
kernel () at /home/nionios/git/test/cuda_gdb/libr/main.cu:8
8	}

I am using a newer CUDA version, so it my case it’s slightly different:

Type "apropos word" to search for commands related to "word"...
Reading symbols from not_debugable...
(cuda-gdb) b kernel
Breakpoint 1 at 0x7bb92: file /home/akravets/Downloads/cuda_gdb-master/libr/main.cu, line 5.
(cuda-gdb) run
Starting program: /home/akravets/Downloads/cuda_gdb-master/not_debugable 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff540b000 (LWP 1108879)]
[New Thread 0x7fffeffff000 (LWP 1108880)]
[Detaching after fork from child process 1108881]
[New Thread 0x7fffef2dd000 (LWP 1108891)]
[New Thread 0x7fffee438000 (LWP 1108892)]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]

Thread 1 "not_debugable" hit Breakpoint 1, 0x00007fffc7036800 in kernel()<<<(1,1,1),(5,1,1)>>> ()
(cuda-gdb) next
Single stepping until exit from function _Z6kernelv,
which has no line number information.
0x00007ffff56ff338 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1

Anyway the main difference seems to be in the source mapping:

  • debuggable
Reading symbols from debugable...
(cuda-gdb) b kernel
Breakpoint 1 at 0xadcd: file /home/akravets/Downloads/cuda_gdb-master/libr/main.cu, line 5.
(cuda-gdb) r
Starting program: /home/akravets/Downloads/cuda_gdb-master/debugable 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff540b000 (LWP 1108981)]
[New Thread 0x7fffeffff000 (LWP 1108982)]
[Detaching after fork from child process 1108983]
[New Thread 0x7fffef2dd000 (LWP 1108993)]
[New Thread 0x7fffee438000 (LWP 1108994)]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]

Thread 1 "debugable" hit Breakpoint 1, kernel<<<(1,1,1),(5,1,1)>>> () at /home/akravets/Downloads/cuda_gdb-master/libr/main.cu:6
6           const auto thread_id = threadIdx.x + blockDim.x * threadIdx.y;
(cuda-gdb) info source
Current source file is /home/akravets/Downloads/cuda_gdb-master/libr/main.cu
Compilation directory is /home/akravets/Downloads/cuda_gdb-maste
Located in /home/akravets/Downloads/cuda_gdb-master/libr/main.cu
Contains 14 lines.
Source language is c++.
Producer is lgenfe: EDG 6.4.
Compiled with DWARF 2 debugging format.
Does not include preprocessor macro info.
  • not_debuggable
Type "apropos word" to search for commands related to "word"...
Reading symbols from not_debugable...
(cuda-gdb) b kernel
Breakpoint 1 at 0x7bb92: file /home/akravets/Downloads/cuda_gdb-master/libr/main.cu, line 5.
(cuda-gdb) r
Starting program: /home/akravets/Downloads/cuda_gdb-master/not_debugable 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff540b000 (LWP 1109102)]
[New Thread 0x7fffeffff000 (LWP 1109103)]
[Detaching after fork from child process 1109104]
[New Thread 0x7fffef2dd000 (LWP 1109114)]
[New Thread 0x7fffee438000 (LWP 1109115)]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]

Thread 1 "not_debugable" hit Breakpoint 1, 0x00007fffc7036800 in kernel()<<<(1,1,1),(5,1,1)>>> ()
(cuda-gdb) info source
No current source file.

We will work with the compiler team to get this resolved.

1 Like

@nionios
After further investigation we have found another possible solution. Please try adding the following to the top level CMakeLists.txt

set(CUDA_NVCC_FLAGS "-g -G")

E.g. as follows

include_directories(${CUDA_INCLUDE_DIRS})

set(CMAKE_CUDA_FLAGS ${CMAKE_CUDA_FLAGS} "-g -G")
set(CUDA_NVCC_FLAGS "-g -G")

add_subdirectory(libr)

and re-run the cmake, re-build the binaries.

1 Like

Wow works like a charm! Thanks a lot! What is the difference between CMAKE_CUDA_FLAGS and CUDA_NVCC_FLAGS? I thought I had already set the debugging flags for nvcc

You CMakeLists.txt uses a mix of deprecated FindCuda.cmake (which ignores CMAKE_CUDA_FLAGS) and new native cmake CUDA support (e.g. CMAKE_CUDA_FLAGS).

The FindCuda.cmake-based method uses CUDA_NVCC_FLAGS.

More details:

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.