Cannot debug CUDA application

alevincio · July 5, 2010, 9:11am

Hi,

I’m having a debugging problem and I cannot understand what is happening. Take as an example the trivial code attached in the file testkernel.cu.

I compile this source file with “nvcc -arch=sm_13 -g -G testkernel.cu -o testkernel”. When I execute it I get the expected result, but when I try to debug it with cuda-gdb the debugger blocks at the call to cudaMalloc. Here is what I get in my terminal:

[codebox]
user@server:~$ ./testkernel
0 0
1 3
2 6
3 9
4 12
5 15
6 18
7 21
8 24
9 27
user@server:~$ cuda-gdb testkernel
NVIDIA (R) CUDA Debugger
BETA release
GNU gdb 6.6
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
Using host libthread_db library "/lib/libthread_db.so.1".
(cuda-gdb) break main
Breakpoint 1 at 0x417c97: file testkernel.cu, line 13.
(cuda-gdb) run
Starting program: /home/vincenzi/testkernel
Breakpoint 1 at 0x417c8b: file testkernel.cu, line 11.
Breakpoint 1 at 0x417c97: file testkernel.cu, line 13.
[Thread debugging using libthread_db enabled]
[New process 13431]
[New Thread 140190674859776 (LWP 13431)]
[Switching to Thread 140190674859776 (LWP 13431)]
Breakpoint 1, main () at testkernel.cu:13
13 cudaError error = cudaSetDevice(0);
Current language: auto; currently c++
(cuda-gdb) next
Warning: a GPU was made unavailable to the application due to debugging
constraints. This may change the application behaviour!
15 if (error != cudaSuccess)
(cuda-gdb) next
22 if ( cudaMalloc ((void **) &device_x, N * sizeof(double)) != cudaSuccess)
(cuda-gdb) next
^C
Program received signal SIGINT, Interrupt.
0x00007f80adc27e48 in ?? () from /usr/lib/libcuda.so
(cuda-gdb)
[/codebox]

It stays blocked and nothing happens until I interrupt it with Ctrl-C. What is happening? Why does normal execution work fine while debugging does not? I’m running the code on a Linux server with two GeForce GTX 295:

[codebox]
user@server:~$ ./NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There are 2 devices supporting CUDA

Device 0: "GeForce GTX 295"
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 938803200 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.24 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)

Device 1: "GeForce GTX 295"
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 939261952 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.24 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)

Test PASSED
Press ENTER to exit...
user@server:~$
[/codebox]
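
In case the attachment does not display, here is a minimal sketch of a program with the same shape as testkernel.cu. It is a reconstruction, not the attached file: only the cudaSetDevice and cudaMalloc lines are taken from the debugger listing above, while the kernel name triple, the value N = 10, the launch configuration, and the printing loop are assumptions chosen to be consistent with the output shown earlier.

```cuda
// Hypothetical reconstruction of testkernel.cu (NOT the attached file).
// Only the cudaSetDevice and cudaMalloc lines come from the debugger listing.
#include <cstdio>
#include <cuda_runtime.h>

#define N 10  // assumption: the program printed ten rows

// Assumed kernel: stores 3 * i in element i.
__global__ void triple(double *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] = 3.0 * i;
}

int main()
{
    double host_x[N];
    double *device_x = NULL;

    cudaError error = cudaSetDevice(0);
    if (error != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice: %s\n", cudaGetErrorString(error));
        return 1;
    }

    // Under cuda-gdb, stepping over this call is where the session blocks.
    if (cudaMalloc((void **) &device_x, N * sizeof(double)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // Assumed launch configuration: one block of N threads.
    triple<<<1, N>>>(device_x, N);

    // cudaMemcpy waits for the kernel to finish before copying the results.
    error = cudaMemcpy(host_x, device_x, N * sizeof(double),
                       cudaMemcpyDeviceToHost);
    cudaFree(device_x);
    if (error != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy: %s\n", cudaGetErrorString(error));
        return 1;
    }

    // Print each index next to the value computed on the device.
    for (int i = 0; i < N; ++i)
        printf("%d %g\n", i, host_x[i]);

    return 0;
}
```

The sketch builds with the same nvcc command line quoted at the top of the post. The kernel uses double, which is why the sm_13 target matters on this hardware.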

Thank you in advance,

Alessandro Vincenzi
