cuda-gdb in applications with textures

I’m happy to find that CUDA 3.0’s new debugging format compiles and links fine with CMake. Yay, now I can finally try to use when debugging my large app.

Or maybe not: the documentation says “Debugging applications using textures is not supp
orted.”

Does this mean that I cannot use cuda-gdb on any kernels because my application uses textures extensively? Or just that I cannot use cuda-gdb to debug those kernels that use textures?

I ask because it doesn’t seem to be working at all.
I’m attempting to set a breakpoint at the first line of a simple kernel that doesn’t use textures (though other kernels in the file do). When I run, I get a break on the last line of the kernel, on the CPU after the kernel has finished executing.

Smaller test applications compiled and built the same way will break in kernels just fine.

Can you give me a repro case? It should detect texture usage on a per-function basis and simply not debug those kernels, not on a per-module basis.

Thanks for the confirmation that it should be a per function limit. I’ll build up the complexity of my reproduction case and see if I can get the same behavior from that. Will post back w/ more info.

Alright, it is indeed textures per compilation unit that is the cause. I’ve got a minimal repro case:

For the record, I’m running this on

Ubuntu 9.04, 32-bit (I know, 32-bit: I didn’t install the OS on this thing)

CUDA 3.0

Driver: 195.36.15

I built a simple 2 cu and 1 cc file test prog. add.cu has two kernels in it: one uses texture and one does not. sub.cu only has a non-texure kernel. My main aim is to test the debugability of an app built with CMake, but I get the exact same behavior if it is compiled with just: $ nvcc -g -G add.cu sub.cu test.cc -o test

Here are abbreviated cuda-gdb sessions showing the behavior I get:

The first one places a breakpoint in the non-texture using kernel in add.cu.

$ cuda-gdb ./test

NVIDIA (R) CUDA Debugger

3.0 release

........

(cuda-gdb) break add.cu:11

Breakpoint 1 at 0x8053eb1: file add.cu, line 11.

(cuda-gdb) run

Starting program: /home/joaander/test/source/test 

[Thread debugging using libthread_db enabled]

[New process 8497]

[New Thread -1210673456 (LWP 8497)]

Breakpoint 1 at 0x857fd18: file add.cu, line 11.

Breakpoint 1 at 0x8053eb1: file add.cu, line 11.

[Switching to Thread -1210673456 (LWP 8497)]

Breakpoint 1, add_kernel (__cuda_0=0x110000, __cuda_1=256) at add.cu:14

14		}

(cuda-gdb) quit

The program is running.  Exit anyway? (y or n) y

Note how it incorrectly broke out at line 14 on the CPU, after the kernel completed. Not shown here, but at this point, I can run the continue command and continue execution correctly.

Next up: a breakpoint in sub.cu (the compilation unit without any textures) works perfectly:

$ cuda-gdb ./test

NVIDIA (R) CUDA Debugger

3.0 release

......

(cuda-gdb) break sub.cu:11

Breakpoint 1 at 0x8064256: file sub.cu, line 11.

(cuda-gdb) run

Starting program: /home/joaander/test/source/test 

[Thread debugging using libthread_db enabled]

[New process 8504]

[New Thread -1211652400 (LWP 8504)]

Breakpoint 1 at 0x858dd18: file sub.cu, line 11.

Breakpoint 1 at 0x8064256: file sub.cu, line 11.

Program received signal SIGTRAP, Trace/breakpoint trap.

[Switching to Thread -1211652400 (LWP 8504)]

[Current CUDA Thread <<<(0,0),(0,0,0)>>>]

add_kernel <<<(1,1),(256,1,1)>>> (data=0x110000, n=256) at add.cu:11

11			unsigned int cur = data[idx];

(cuda-gdb) step

[Current CUDA Thread <<<(0,0),(0,0,0)>>>]

add_kernel <<<(1,1),(256,1,1)>>> (data=0x110000, n=256) at add.cu:12

12			data[idx] = cur+idx;

(cuda-gdb) print cur

$1 = 0

(cuda-gdb) cuda thread(10,0,0

New CUDA focus: device 0, sm 0, warp 0, lane 10, grid 60000, block (0,0), thread (10,0,0).

(cuda-gdb) print cur

$2 = 10

(cuda-gdb) quit

The program is running.  Exit anyway? (y or n) y

Lastly, placing a breakpoint inside the kernel that uses a texture produces interesting results.

$ cuda-gdb ./test

NVIDIA (R) CUDA Debugger

3.0 release

....

Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".

(cuda-gdb) break add.cu:27

Breakpoint 1 at 0x8053e26: file add.cu, line 27.

(cuda-gdb) run

Starting program: /home/joaander/test/source/test 

[Thread debugging using libthread_db enabled]

[New process 16158]

[New Thread -1211156784 (LWP 16158)]

Breakpoint 1 at 0xa8: file add.cu, line 27.

Breakpoint 1 at 0x8053e26: file add.cu, line 27.

[Switching to Thread -1211156784 (LWP 16158)]

Breakpoint 1, add_tex_kernel (__cuda_0=0x110000, __cuda_1=256) at add.cu:30

30		}

(cuda-gdb) continue

Continuing.

^C

Program received signal SIGINT, Interrupt.

0xb75cbcd0 in ?? () from /usr/lib/libcuda.so

(cuda-gdb) quit

The program is running.  Exit anyway? (y or n) y

^C^C

It again broke on the CPU at the end of the kernel where the breakpoint was placed. Note that the system froze up for several minutes with ./test using 100% CPU at the “Continuing.” line, forcing me to control-C.

Attaching a tarball with the source code.
test.tar.gz (872 Bytes)

Alright, it is indeed textures per compilation unit that is the cause. I’ve got a minimal repro case:

For the record, I’m running this on

Ubuntu 9.04, 32-bit (I know, 32-bit: I didn’t install the OS on this thing)

CUDA 3.0

Driver: 195.36.15

I built a simple 2 cu and 1 cc file test prog. add.cu has two kernels in it: one uses texture and one does not. sub.cu only has a non-texure kernel. My main aim is to test the debugability of an app built with CMake, but I get the exact same behavior if it is compiled with just: $ nvcc -g -G add.cu sub.cu test.cc -o test

Here are abbreviated cuda-gdb sessions showing the behavior I get:

The first one places a breakpoint in the non-texture using kernel in add.cu.

$ cuda-gdb ./test

NVIDIA (R) CUDA Debugger

3.0 release

........

(cuda-gdb) break add.cu:11

Breakpoint 1 at 0x8053eb1: file add.cu, line 11.

(cuda-gdb) run

Starting program: /home/joaander/test/source/test 

[Thread debugging using libthread_db enabled]

[New process 8497]

[New Thread -1210673456 (LWP 8497)]

Breakpoint 1 at 0x857fd18: file add.cu, line 11.

Breakpoint 1 at 0x8053eb1: file add.cu, line 11.

[Switching to Thread -1210673456 (LWP 8497)]

Breakpoint 1, add_kernel (__cuda_0=0x110000, __cuda_1=256) at add.cu:14

14		}

(cuda-gdb) quit

The program is running.  Exit anyway? (y or n) y

Note how it incorrectly broke out at line 14 on the CPU, after the kernel completed. Not shown here, but at this point, I can run the continue command and continue execution correctly.

Next up: a breakpoint in sub.cu (the compilation unit without any textures) works perfectly:

$ cuda-gdb ./test

NVIDIA (R) CUDA Debugger

3.0 release

......

(cuda-gdb) break sub.cu:11

Breakpoint 1 at 0x8064256: file sub.cu, line 11.

(cuda-gdb) run

Starting program: /home/joaander/test/source/test 

[Thread debugging using libthread_db enabled]

[New process 8504]

[New Thread -1211652400 (LWP 8504)]

Breakpoint 1 at 0x858dd18: file sub.cu, line 11.

Breakpoint 1 at 0x8064256: file sub.cu, line 11.

Program received signal SIGTRAP, Trace/breakpoint trap.

[Switching to Thread -1211652400 (LWP 8504)]

[Current CUDA Thread <<<(0,0),(0,0,0)>>>]

add_kernel <<<(1,1),(256,1,1)>>> (data=0x110000, n=256) at add.cu:11

11			unsigned int cur = data[idx];

(cuda-gdb) step

[Current CUDA Thread <<<(0,0),(0,0,0)>>>]

add_kernel <<<(1,1),(256,1,1)>>> (data=0x110000, n=256) at add.cu:12

12			data[idx] = cur+idx;

(cuda-gdb) print cur

$1 = 0

(cuda-gdb) cuda thread(10,0,0

New CUDA focus: device 0, sm 0, warp 0, lane 10, grid 60000, block (0,0), thread (10,0,0).

(cuda-gdb) print cur

$2 = 10

(cuda-gdb) quit

The program is running.  Exit anyway? (y or n) y

Lastly, placing a breakpoint inside the kernel that uses a texture produces interesting results.

$ cuda-gdb ./test

NVIDIA (R) CUDA Debugger

3.0 release

....

Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".

(cuda-gdb) break add.cu:27

Breakpoint 1 at 0x8053e26: file add.cu, line 27.

(cuda-gdb) run

Starting program: /home/joaander/test/source/test 

[Thread debugging using libthread_db enabled]

[New process 16158]

[New Thread -1211156784 (LWP 16158)]

Breakpoint 1 at 0xa8: file add.cu, line 27.

Breakpoint 1 at 0x8053e26: file add.cu, line 27.

[Switching to Thread -1211156784 (LWP 16158)]

Breakpoint 1, add_tex_kernel (__cuda_0=0x110000, __cuda_1=256) at add.cu:30

30		}

(cuda-gdb) continue

Continuing.

^C

Program received signal SIGINT, Interrupt.

0xb75cbcd0 in ?? () from /usr/lib/libcuda.so

(cuda-gdb) quit

The program is running.  Exit anyway? (y or n) y

^C^C

It again broke on the CPU at the end of the kernel where the breakpoint was placed. Note that the system froze up for several minutes with ./test using 100% CPU at the “Continuing.” line, forcing me to control-C.

Attaching a tarball with the source code.

Thanks a bunch, I will throw this at the debugger guys.

Thanks a bunch, I will throw this at the debugger guys.

Will debugging of kernels that use textures be supported prior to eliminating device emulation?

Will debugging of kernels that use textures be supported prior to eliminating device emulation?