I have a device core dump file with no symbols in it, because -G was not used when compiling.
I want to add symbol information in cuda-gdb so that I can track down the bug (it cannot be reproduced easily).
I found that cuda-gdb has a --symbols option. I recompiled the program with -G, ran cuda-gdb with --symbols=my-new-program, and loaded the GPU core dump with the command:
(cuda-gdb) target cudacore core_xxxx.nvcudmp
cuda-gdb hung when I tried to print global memory with symbols, like:
(cuda-gdb) p (xxx::some_type *) 0x7f648670d600
Any idea?
The version of cuda-gdb is 11.4.
Hi @heibaidaolx123
Thank you for your report! To help us identify the issue, could you clarify a few things:
- Can you print the same global memory without adding the symbols?
p/x 0x7f648670d600
- What GPU are you using? Could you paste the nvidia-smi output?
- How did you obtain the address?
@AKravets
- Can you print the same global memory without adding the symbols?
(cuda-gdb) p/x 0x7f648670d600
$1 = 0x7f648670d600
- What GPU are you using? Could you paste the nvidia-smi output?
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A40          On   | 00000000:01:00.0 Off |                    0 |
|  0%   45C    P0   118W / 300W |  28439MiB / 45634MiB |     74%      Default |
|                               |                      |                  N/A |
- How did you obtain the address?
I got the address from the CPU core dump.
Hi @heibaidaolx123,
Unfortunately, the use case you are describing is not supported. When the program is recompiled with the -G flag, the compiler also disables optimization passes, so the generated device code differs from the original (even setting aside the debug information). As a result, symbols from a binary compiled with -G cannot be applied to a core dump produced by a binary compiled without it. From the nvcc documentation:
--device-debug (-G)
Generate debug information for device code. Turns off all optimizations.
Don't use for profiling; use -lineinfo instead.
You would have to generate a core dump from the program built with -G.
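For the next crash, the workflow would look roughly like the sketch below: build with -G, enable GPU core dumps through the CUDA_ENABLE_COREDUMP_ON_EXCEPTION and CUDA_COREDUMP_FILE environment variables, reproduce the failure, and then open the dump against that same debug binary. The file names (my_program.cu, my_program_debug, core_debug_build.nvcudmp) are placeholders, not taken from your setup, and the guard around the build step is only there so the snippet does nothing on a machine without nvcc.

```shell
# Placeholder names (assumptions, adjust to your project).
SRC=my_program.cu
BIN=my_program_debug

# Ask the CUDA driver to write a GPU core dump when a device exception occurs,
# and give the dump a fixed, known file name.
export CUDA_ENABLE_COREDUMP_ON_EXCEPTION=1
export CUDA_COREDUMP_FILE=core_debug_build.nvcudmp

# Build with host (-g) and device (-G) debug information, then run until the
# failure reproduces. The run is expected to crash, so its exit code is ignored.
if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    nvcc -g -G -o "$BIN" "$SRC"
    ./"$BIN" || true
fi

# Afterwards, inspect the dump with the SAME binary that produced it:
#   cuda-gdb ./my_program_debug
#   (cuda-gdb) target cudacore core_debug_build.nvcudmp
```

Since -G turns off optimizations, the bug may behave differently or take longer to reproduce in this build. If you need to keep the optimized build, compiling with -lineinfo (as the nvcc help text above suggests for profiling) at least keeps source line information for device code without changing optimization, though it does not give you full variable and type symbols.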