Getting into a CUDA Subprogram: why is the debugger stepping over CUDA subprograms?

newport_j · November 20, 2009, 8:52pm

#include <stdio.h>

// Simple 8-bit bit reversal Compute test

#define N 256

__global__ void bitreverse(unsigned int *data)
{
    unsigned int *idata = data;

    unsigned int x = idata[threadIdx.x];

    // Swap nibbles, then bit pairs, then adjacent bits within each byte
    x = ((0xf0f0f0f0 & x) >> 4) | ((0x0f0f0f0f & x) << 4);
    x = ((0xcccccccc & x) >> 2) | ((0x33333333 & x) << 2);
    x = ((0xaaaaaaaa & x) >> 1) | ((0x55555555 & x) << 1);

    idata[threadIdx.x] = x;
}

int main(void)
{
    unsigned int *d = NULL; int i;
    unsigned int idata[N], odata[N];

    for (i = 0; i < N; i++)
        idata[i] = (unsigned int)i;

    cudaMalloc((void**)&d, sizeof(int)*N);
    cudaMemcpy(d, idata, sizeof(int)*N,
               cudaMemcpyHostToDevice);

    bitreverse<<<1, N>>>(d);

    cudaMemcpy(odata, d, sizeof(int)*N,
               cudaMemcpyDeviceToHost);

    for (i = 0; i < N; i++)
        printf("%u -> %u\n", idata[i], odata[i]);

    cudaFree((void*)d);
    return 0;
}

The above code is from the CUDA-GDB manual V2.3, June 2009. It is designed so a student can step through the execution lines and understand what goes on in a CUDA program. I cannot get into the bitreverse subprogram. The command

break bitreverse

if it is the only break command in the program, does not stop the program’s execution in the bitreverse subprogram. What am I doing wrong? The CUDA-GDB manual states clearly that you cannot step over subroutines, yet that is apparently what I am doing. Even when I step through the program one executable line at a time, I still do not go into the bitreverse subprogram. I want to learn about CUDA code, and that is why I want this. I already know C. The debugger goes into C subprograms, but not CUDA subprograms.
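One check that does not depend on the debugger is to compare the program's printed output with the same arithmetic done on the host. The sketch below is not part of the manual's listing; the helper name reverse_bits_in_bytes is hypothetical and simply repeats the three mask-and-shift steps from the kernel. If the program prints the same pairs, the kernel ran even though the debugger did not stop in it.

```cuda
#include <stdio.h>

// Hypothetical host-side helper (not from the manual): the same three steps
// as the kernel. It swaps nibbles, then bit pairs, then adjacent bits, which
// reverses the bit order inside every byte of x.
static unsigned int reverse_bits_in_bytes(unsigned int x)
{
    x = ((0xf0f0f0f0u & x) >> 4) | ((0x0f0f0f0fu & x) << 4);
    x = ((0xccccccccu & x) >> 2) | ((0x33333333u & x) << 2);
    x = ((0xaaaaaaaau & x) >> 1) | ((0x55555555u & x) << 1);
    return x;
}

int main(void)
{
    // Prints: 0 -> 0, 1 -> 128, 2 -> 64, 3 -> 192 (one pair per line)
    for (unsigned int i = 0; i < 4; i++)
        printf("%u -> %u\n", i, reverse_bits_in_bytes(i));
    return 0;
}
```

This is host-only code, so it should build with nvcc or with an ordinary C or C++ compiler.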

I compile with

nvcc -g -G bitreverse.cu -o bitreverse

and then run

cuda-gdb bitreverse

then once in the debugger I type

break bitreverse

It responds as the manual says it should, but it still does not go into CUDA subprograms.

Why is this?

respectfully,

newport_j

avidday · November 20, 2009, 9:10pm

You have to start the program inside cuda-gdb. Try typing

run

newport_j · November 20, 2009, 9:25pm

I believe that is what I am doing. I compile

nvcc -g -G cuda.cu -o cuda

then

cuda-gdb cuda

then

breakpoint bitreverse

then

run

Note that bitreverse is the CUDA subprogram. If I put a breakpoint in the CUDA subprogram, it is ignored. The debugger stops in main and in all of the C subprograms, but not in the CUDA subprograms.

I just do not know why.
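A breakpoint inside a kernel can only be hit if the kernel actually launches, so it may help to rule out a silent launch failure first. What follows is a minimal sketch, not the manual's listing. The CUDA_CHECK macro and the run_bitreverse_checked function are hypothetical names introduced here. The call to cudaDeviceSynchronize assumes CUDA 4.0 or later; on the 2.3 toolkit discussed in this thread the equivalent call was cudaThreadSynchronize.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define N 256

// Hypothetical helper macro: print the CUDA error with its source location
// and return 1 from the enclosing function. For brevity this sketch does not
// free device memory on the error path.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            return 1;                                              \
        }                                                          \
    } while (0)

__global__ void bitreverse(unsigned int *data)
{
    unsigned int x = data[threadIdx.x];
    x = ((0xf0f0f0f0 & x) >> 4) | ((0x0f0f0f0f & x) << 4);
    x = ((0xcccccccc & x) >> 2) | ((0x33333333 & x) << 2);
    x = ((0xaaaaaaaa & x) >> 1) | ((0x55555555 & x) << 1);
    data[threadIdx.x] = x;
}

// Runs the kernel on the inputs 0..N-1 and copies the results into odata.
// Returns 0 on success, 1 if any CUDA call reported an error.
int run_bitreverse_checked(unsigned int *odata)
{
    unsigned int *d = NULL;
    unsigned int idata[N];

    for (int i = 0; i < N; i++)
        idata[i] = (unsigned int)i;

    CUDA_CHECK(cudaMalloc((void **)&d, sizeof(unsigned int) * N));
    CUDA_CHECK(cudaMemcpy(d, idata, sizeof(unsigned int) * N,
                          cudaMemcpyHostToDevice));

    bitreverse<<<1, N>>>(d);
    CUDA_CHECK(cudaGetLastError());       // errors detected at launch time
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel runs

    CUDA_CHECK(cudaMemcpy(odata, d, sizeof(unsigned int) * N,
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d));
    return 0;
}

int main(void)
{
    unsigned int odata[N];

    if (run_bitreverse_checked(odata) != 0)
        return 1;
    printf("1 -> %u\n", odata[1]);
    return 0;
}
```

If this version reports an error, the problem is probably with the launch or the device rather than with the breakpoint. If it runs cleanly, the debugger setup is the more likely suspect.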

Respectfully,

newport_j

avidday · November 20, 2009, 9:37pm

It works for me.

avid@cuda:~$ /opt/cuda/bin/nvcc -g -G yunker.cu -o yunker
avid@cuda:~$ LD_LIBRARY_PATH=/opt/cuda/lib64 /opt/cuda/bin/cuda-gdb ./yunker
NVIDIA (R) CUDA Debugger
BETA release
Portions Copyright (C) 2008,2009 NVIDIA Corporation
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
Using host libthread_db library "/lib/libthread_db.so.1".
(cuda-gdb) list
14	x = ((0xf0f0f0f0 & x) >> 4) | ((0x0f0f0f0f & x) << 4);
15	x = ((0xcccccccc & x) >> 2) | ((0x33333333 & x) << 2);
16	x = ((0xaaaaaaaa & x) >> 1) | ((0x55555555 & x) << 1);
17
18	idata[threadIdx.x] = x;
19	}
20
21	int main(void)
22	{
23	unsigned int *d = NULL; int i;
(cuda-gdb)
24	unsigned int idata[N], odata[N];
25
26	for (i = 0; i < N; i++)
27	idata[i] = (unsigned int)i;
28
29	cudaMalloc((void**)&d, sizeof(int)*N);
30	cudaMemcpy(d, idata, sizeof(int)*N,
31	cudaMemcpyHostToDevice);
32
33	bitreverse<<<1, N>>>(d);
(cuda-gdb) break bitreverse
Breakpoint 1 at 0x40ee29: file yunker.cu, line 10.
(cuda-gdb) run
Starting program: /home/avid/yunker
[Thread debugging using libthread_db enabled]
[New process 32208]
[New Thread 140040012605184 (LWP 32208)]
Warning: a GPU was made unavailable to the application due to debugging
constraints.  This may change the application behaviour!
[Switching to Thread 140040012605184 (LWP 32208)]
[Current CUDA Thread <<<(0,0),(0,0,0)>>>]
Breakpoint 1, bitreverse () at yunker.cu:10
10	unsigned int *idata = data;
Current language:  auto; currently c++
(cuda-gdb) list
5
6	#define N 256
7
8	__global__ void bitreverse(unsigned int *data)
9	{
10	unsigned int *idata = data;
11
12	unsigned int x = idata[threadIdx.x];
13
14	x = ((0xf0f0f0f0 & x) >> 4) | ((0x0f0f0f0f & x) << 4);
(cuda-gdb)

The GPU you run cuda-gdb on must not have a display manager running on it. Are you trying to run the debugger on a card shared with X11?
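To see which GPUs the runtime can use and whether any of them looks like it is driving a display, a small device query can help. The sketch below uses a hypothetical function name, print_device_summary, and rests on one assumption: a GPU that reports a kernel run-time limit is often, though not always, the one attached to a display. It also relies on cudaDeviceGetAttribute, which is newer than the CUDA 2.3 toolkit discussed in this thread.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: list each CUDA device and report whether a run-time
// limit applies to kernels on it. Treating that limit as a hint that the GPU
// drives a display is a heuristic, not a definitive test.
// Returns the number of devices found, or 0 if the count could not be read.
int print_device_summary(void)
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 0;
    }

    for (int dev = 0; dev < count; dev++) {
        cudaDeviceProp prop;
        int timeout = 0;

        // Skip any device whose properties cannot be queried.
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
            continue;
        if (cudaDeviceGetAttribute(&timeout, cudaDevAttrKernelExecTimeout,
                                   dev) != cudaSuccess)
            continue;

        printf("device %d: %s, kernel run-time limit: %s\n",
               dev, prop.name, timeout ? "yes" : "no");
    }
    return count;
}

int main(void)
{
    print_device_summary();
    return 0;
}
```

On a machine with a single GPU that reports a run-time limit, the debugger and the display are probably competing for the same card.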

frea · November 20, 2009, 11:29pm

It happens to me too. A quick workaround is to break on line numbers, that is, type

b filename:linenumber

into gdb.
