I am following the cuda-gdb walkthrough at http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/cuda-gdb.pdf (chapter 011), but I am not able to single-step through the kernel code. The breakpoint is hit, but the first "next" runs the kernel to completion:
% cuda-gdb bitreverse
NVIDIA (R) CUDA Debugger
4.0 release
Portions Copyright (C) 2007-2011 NVIDIA Corporation
GNU gdb 6.3.50.20050815-cvs (Fri May 13 10:38:44 UTC 2011)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "--host=i686-apple-darwin10.0.0 --target="...unable to read unknown load command 0x24
unable to read unknown load command 0x26
unable to read unknown load command 0x24
unable to read unknown load command 0x26
unable to read unknown load command 0x24
unable to read unknown load command 0x26
unable to read unknown load command 0x24
unable to read unknown load command 0x26
unable to read unknown load command 0x24
unable to read unknown load command 0x26
unable to read unknown load command 0x24
unable to read unknown load command 0x26
unable to read unknown load command 0x24
unable to read unknown load command 0x26
Reading symbols for shared libraries ... done
unable to read unknown load command 0x24
unable to read unknown load command 0x26
(cuda-gdb) b bitreverse
Breakpoint 1 at 0x281c: file tmpxft_00007d16_00000000-1_bitreverse.cudafe1.stub.c, line 8.
(cuda-gdb) r
Starting program: /Users/geordan/src/cuda/bitreverse
Reading symbols for shared libraries + done
unable to read unknown load command 0x24
unable to read unknown load command 0x26
(lots of this)
Reading symbols for shared libraries +.++........................................................................................................ done
Reading symbols for shared libraries .. done
Reading symbols for shared libraries .. done
[Context Create of context 0x69980e00 on Device 0]
[Launch of CUDA Kernel 0 (bitreverse<<<(1,1,1),(256,1,1)>>>) on Device 0]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
Breakpoint 1, bitreverse<<<(1,1,1),(256,1,1)>>> (data=0x110000) at bitreverse.cu:9
9 unsigned int *idata = (unsigned int*)data;
(cuda-gdb) bt
#0 bitreverse<<<(1,1,1),(256,1,1)>>> (data=0x110000) at bitreverse.cu:9
(cuda-gdb) next
[Termination of CUDA Kernel 0 (bitreverse<<<(1,1,1),(256,1,1)>>>) on Device 0]
[Switching to process 32169]
0x99b959c7 in __mtx_droplock ()
Why does the debugger let the kernel run to termination after "next" instead of stepping to the next source line?
This is on the OS X 10.7 preview, with CUDA 4.0.19 and toolkit 4.0.17.