cuda-gdb and 32-bit compilation on 64-bit Linux: is it possible?

Hi there,

I’m trying to use cuda-gdb (32-bit, from CUDA 2.1) on a 64-bit Linux machine.

I have a proper multilib environment, and I can compile everything in 32-bit mode by passing -m32 to gcc and nvcc without any problem. The trouble starts with “nvcc -g -G”, which cuda-gdb needs.

Apparently the -m32 option is not passed on correctly to ptxas or to the linker. For ptxas, adding “-Xptxas -m32” was enough to get it working. For the linker it’s even worse: adding “-Xlinker -melf_i386” is not enough, so I had to write a little script that comes before “ld” on my PATH and calls the real /usr/bin/ld with the proper parameters to make it link in 32-bit mode.
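
A wrapper of that kind could look roughly like the sketch below. This is not the exact script, only an illustration of the idea, and it assumes GNU ld, whose `-m elf_i386` option selects the 32-bit x86 emulation. The `REAL_LD` variable and the `ld32` function name are made up for this example.

```shell
#!/bin/sh
# Hypothetical "ld" wrapper: save it as an executable file named "ld" in a
# directory that comes before /usr/bin on PATH.
# REAL_LD is an illustrative override; it defaults to the real linker path
# mentioned in the post.
REAL_LD="${REAL_LD:-/usr/bin/ld}"

ld32() {
    # -m elf_i386 selects GNU ld's 32-bit x86 emulation; all other
    # arguments are passed through unchanged.
    "$REAL_LD" -m elf_i386 "$@"
}

# Only call the linker when this file is invoked under the name "ld",
# so the function can also be sourced and tried out on its own.
if [ "${0##*/}" = "ld" ]; then
    ld32 "$@"
fi
```

Setting `REAL_LD=echo` before calling `ld32` prints the command line that would be passed to the real linker, which is a cheap way to check what nvcc ends up asking for.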

But now I’m stuck on another error during the final link: undefined references to things that are apparently in the symbol table…

Here is what I get using “nvcc -g -G -v”:

[codebox]
nvcc --compiler-bindir=gcc-4.1 --ptxas-options=-v -m32 -O0 -g -G -v -Xptxas -m32 -Xlinker -melf_i386 -c -o kernels_bicg.o kernels_bicg.cu
#$ ptxas --key="b7df5b781a61e779" -arch=sm_10 --debug-info "/tmp/tmpxft_00006332_00000000-6_kernels_bicg.sm_10.cubin.dwarf.s" --translation-map "/tmp/tmpxft_00006332_00000000-6_kernels_bicg.sm_10.cubin.ptxmap" --link-info "kernels_bicg.linkinfo" -v -m32 --dont-merge-basicblocks --return-at-end "/tmp/tmpxft_00006332_00000000-2_kernels_bicg.ptx" -o "/tmp/tmpxft_00006332_00000000-6_kernels_bicg.sm_10.cubin"
#$ ld -r -o "kernels_bicg.o" "/tmp/tmpxft_00006332_00000000-11_tmpxft_00006332_00000000-6_kernels_bicg.sm_10.cubin.dwarf.o" "/tmp/tmpxft_00006332_00000000-14_kernels_bicg.o"

g++ -Wall -m32 -L/home/me/lib32 -L/emul/ia32-linux/usr/lib -lm -lcudart -lcublas -g -o my_app all_my_files.o kernels_bicg.o
kernels_bicg.o: In function `$$SymbolTable':
(.nv11Segment+0x16c): undefined reference to `$gpu_registers'
kernels_bicg.o: In function `$$SymbolTable':
(.nv11Segment+0x244): undefined reference to `blockDim'
kernels_bicg.o: In function `$$SymbolTable':
(.nv11Segment+0x250): undefined reference to `gridDim'
kernels_bicg.o: In function `$$SymbolTable':
(.nv11Segment+0x25c): undefined reference to `blockIdx'
kernels_bicg.o: In function `$$SymbolTable':
(.nv11Segment+0x268): undefined reference to `threadIdx'
collect2: ld returned 1 exit status
make: *** [solver] Error 1
[/codebox]

The system gcc/g++ version is 4.3; gcc 4.1 is installed in a subdirectory and used by nvcc, since nvcc does not support gcc 4.3.

Does anyone know how I could fix this? Is there a nice way that doesn’t involve launching 42 commands, each with 17 parameters?

Thanks in advance! :)

You won’t be able to debug it anyway. Just wait for 2.2.

I’m seeing the same problem as Schnouki, above. I’m using CUDA 2.2. Any updates/solutions/suggestions available?

Thx.

Use nvcc to link (with -G) as well.
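
Applied to the link line from the first post, that could look something like the sketch below. It is untested: the library paths and file names are simply copied from that post, `-Wall` is dropped because it has no effect at link time, and whether nvcc 2.x forwards `-m32` to the host linker correctly is not confirmed here. The `NVCC` variable and the `link_debug` function are made up for this example so the command can be dry-run.

```shell
# Sketch: link with nvcc instead of g++ so that -G is also seen at link time.
# NVCC is an illustrative override; it defaults to the nvcc found on PATH.
NVCC="${NVCC:-nvcc}"

link_debug() {
    # "$@" is the list of object files to link into my_app.
    "$NVCC" -m32 -g -G -o my_app "$@" \
        -L/home/me/lib32 -L/emul/ia32-linux/usr/lib -lm -lcudart -lcublas
}

# Usage: link_debug all_my_files.o kernels_bicg.o
```

Running it with `NVCC=echo` only prints the arguments, which helps when comparing them against what the makefile currently passes to g++.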

Thanks! I can now use cuda-gdb to step through the ‘template’ example. (It seems common.mk should be set up so that gmake dbg=1 does this correctly.)

However, when I compile my code with the -G flag (and -Xptxas -v to show memory usage) I get the following error:

[codebox]
$ /usr/local/cuda/bin/nvcc -Xptxas -v -D_DEBUG -I. -I/usr/local/cuda/include -I…/…/common/inc -DUNIX -g -G -o obj/debug/mycode.cu.o -c mycode.cu
ptxas info : Compiling entry function '_Z15KernelPhj'
ptxas info : Used 16 registers, 32556+27584 bytes lmem, 32+32 bytes smem, 16 bytes cmem[14]
ptxas error : Entry function '_Z15KernelPhj' uses too much local data (0x136c bytes + 0x6bc0 bytes system, 0x4000 max)
[/codebox]

I’ve compiled the release version of my code with the -Xptxas -v flag and also run it under the profiler, and both show no local memory usage. I assume that with the -G flag the compiler creates local memory storage for the debugger’s symbol table. Any thoughts on how I can resolve this problem?
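
Converting the hexadecimal figures from the ptxas error to decimal makes them easier to compare with the info line. The snippet below only does that arithmetic; the variable names are made up for this example, and what exactly ptxas counts as “system” data is not something the output itself explains.

```shell
# Values copied from the ptxas error message above.
user_lmem=$((0x136c))     # first figure in the error
system_lmem=$((0x6bc0))   # figure that ptxas labels "system"
max_lmem=$((0x4000))      # limit reported by ptxas
total_lmem=$((user_lmem + system_lmem))

echo "user=$user_lmem system=$system_lmem total=$total_lmem max=$max_lmem"
# prints: user=4972 system=27584 total=32556 max=16384
```

The total (32556) and the “system” part (27584) match the “32556+27584 bytes lmem” in the info line. The “system” part alone is already above the 16384-byte limit, so shrinking the kernel’s own local data would not be enough by itself to get under it.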