Ntch, I'm trying to build a simple program, but with a simple example it works.
I currently have the CUDA code in a shared library. The .cu file is compiled with nvcc -g -G --compiler-options '-fPIC', the rest of the .cpp files with g++, and then everything is linked with nvcc -g -G --compiler-options '-fPIC' -shared.
[codebox]
int block_offset_cols = __mul24(blockIdx.x, blockDim.x);
int block_offset_rows = __mul24(blockIdx.y, blockDim.y);
int array_start = block_offset_rows * array_width + block_offset_cols;
int me = array_start + __mul24(array_width, threadIdx.y) + threadIdx.x;
}[/codebox]
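For anyone who wants to reproduce the session, here is a minimal sketch of a complete program built around the same index arithmetic. The kernel name `index_kernel`, the `out` parameter, the bounds check, and the 2x2 launch configuration are assumptions made for illustration; only the four index lines come from the fragment above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: the name and parameters are assumptions, only the
// index arithmetic mirrors the posted fragment.
__global__ void index_kernel(int *out, int array_width, int array_height)
{
    // First column and first row covered by this block.
    int block_offset_cols = __mul24(blockIdx.x, blockDim.x);
    int block_offset_rows = __mul24(blockIdx.y, blockDim.y);

    // Linear index of the block's first element, then of this thread's element.
    int array_start = block_offset_rows * array_width + block_offset_cols;
    int me = array_start + __mul24(array_width, threadIdx.y) + threadIdx.x;

    // Bounds check (an addition, not in the original fragment).
    int col = block_offset_cols + threadIdx.x;
    int row = block_offset_rows + threadIdx.y;
    if (col < array_width && row < array_height)
        out[me] = me;   // writing the value keeps the computation in use
}

int main()
{
    const int width = 2, height = 2;   // same 2x2 size as in the debug session
    int host[width * height];
    int *dev = nullptr;

    if (cudaMalloc(&dev, sizeof(host)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    dim3 block(2, 2);
    dim3 grid(1, 1);
    index_kernel<<<grid, block>>>(dev, width, height);

    // The copy back also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < width * height; ++i)
        printf("%d ", host[i]);
    printf("\n");
    return 0;
}
```

Compiled with -g -G, a breakpoint on the first index line should let you step through and print the same variables as in the session below.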
[codebox]
Breakpoint 2, gpu_kernel () at gpu_kernel.cu:246
246 int block_offset_cols = __mul24(blockIdx.x, blockDim.x);
Current language: auto; currently c++
… a few nexts to reach the bottom …
(cuda-gdb) print block_offset_cols
$1 = 0
(cuda-gdb) print me
$2 = 0
(cuda-gdb) print array_start
$3 = 0
(cuda-gdb) print array_width
$4 = 2
(cuda-gdb) print array_height
$5 = 2
(cuda-gdb) print blockIdx
$6 = {x = 0, y = 0}
(cuda-gdb) print threadIdx
$7 = {x = 0, y = 0, z = 0}
(cuda-gdb)
[/codebox]
All goes well, but when I try to put a breakpoint in the device function that is in the same library (I try to set it once the shared library is already loaded)
Edit: Another thing, for the developers. When you compile without the -G option, run cuda-gdb, and try to set a breakpoint in a file that is part of a library but isn't in your current folder, you get a beautiful segmentation fault.
Edit2: More info: I'm using 64-bit CentOS Linux, g++ 4.1.2 20080704 (Red Hat 4.1.2-44), and a Tesla C1060.
Post a bug report while you are waiting (that is, run nvidia-bug-report.sh and attach the generated file) and hope to get a reply from someone who knows more.
WAIT!!!
It may not be a bug! The device function you posted does nothing with the data declared inside it. Even at optimization level 0 the compiler will not emit unused variables, which means you cannot see them while debugging. I assume the code you posted was just a sample to check the feature out and not a way to declare variables in a kernel for external use (sorry, I didn't look too closely at the code you posted). Try adding or comparing the two variables and returning the comparison value (something like return t0 == t1), then try again.
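A rough sketch of that suggestion, in case it helps. Only `t0`, `t1`, and the `return t0 == t1` idea come from this thread; the names `compare_locals` and `probe_kernel`, the arithmetic, and the launch setup are made up for illustration. Whether cuda-gdb then shows the locals may still depend on the toolkit version.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical device function: t0 and t1 are named after the suggestion
// above, the rest is an assumption.
__device__ bool compare_locals(int a, int b)
{
    int t0 = a + 1;
    int t1 = b + 1;
    // Both locals feed the return value, so the compiler has a reason
    // to keep them.
    return t0 == t1;
}

__global__ void probe_kernel(int *result)
{
    // Storing the result in global memory keeps the call itself in use.
    result[threadIdx.x] = compare_locals(threadIdx.x, 0) ? 1 : 0;
}

int main()
{
    const int n = 2;
    int host[n];
    int *dev = nullptr;

    if (cudaMalloc(&dev, sizeof(host)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    probe_kernel<<<1, n>>>(dev);

    // The copy back also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < n; ++i)
        printf("thread %d: %d\n", i, host[i]);
    return 0;
}
```

Build it with -g -G as before, set the breakpoint on the return line of `compare_locals`, and try printing t0 and t1 there.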
Sorry for being so late… :(