Cuda-gdb script execution

I would like to write a script for cuda-gdb for automated debugging. The script I have is of the following form:

set pagination off
set logging file SOME_OUTPUT_FILE
set logging off

set cuda break_on_launch application

break SOME_FILE_PATH:SOME_LINE_NUMBER
    cuda kernel 0 block (0,0,0) thread (0,0,0)
    DO_SOMETHING
end

run

Essentially, I am trying to run something at each breakpoint. However, I got "Invalid coordinates. CUDA focus unchanged." when I ran cuda-gdb -x SOME_SCRIPT SOME_APP. What is the correct way to achieve this?

Hi @Ziqi

You need to use the commands gdb command to have certain actions performed on each breakpoint hit (see Break Commands (Debugging with GDB)). E.g., your script might look as follows:

...
break SOME_FILE_PATH:SOME_LINE_NUMBER
commands 1
cuda kernel 0 block (0,0,0) thread (0,0,0)
DO_SOMETHING
end
...
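Putting that fix into the original script, the complete command file might look like this (a sketch; the SOME_* placeholders stand in for your actual values, and "set logging on" is assumed here since the goal appears to be capturing output to the log file):

```gdb
set pagination off
set logging file SOME_OUTPUT_FILE
set logging on

set cuda break_on_launch application

# Breakpoint 1; the commands list below is attached to it by number
break SOME_FILE_PATH:SOME_LINE_NUMBER
commands 1
cuda kernel 0 block (0,0,0) thread (0,0,0)
DO_SOMETHING
end

run
```

Note that "commands" with no argument attaches to the most recently set breakpoint, so a bare "commands" would also work here. Also, with break_on_launch enabled the debugger will additionally stop at each kernel entry; those stops are not hits of breakpoint 1 and so should not run its command list.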

I compiled my code as shown in the following message:

Compiling ../ArrayMPA/src/mpa_utility.cu to make object file ../ArrayMPA/Debug/mpa_utility.x86-64.o.
-------------------------------------------------------------------
nvcc -x cu --compiler-options "-fPIC" --expt-relaxed-constexpr --gpu-architecture=sm_86 -Xcudafe --display_error_number -DPREFETCH=0 -DUSE_DCBX=0 -DX86 -D_LITTLE_ENDIAN_=1 -D__LITTLE_ENDIAN__=1  -DCPLUSPLUS_2011   -DDEBUG=2  -DDETECTION_DEBUG -DCHECKCUDAERROR -DGOOGLE_CUDA=1 -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES -DUNIT_TEST=1  -g -G -O0  -I../../../../../../../../../../../../../usr/local/include  -I../ArrayMPA  -I../ArrayMPA/include  -I../../../../../../../../../../../../../usr/local/cuda/targets/x86_64-linux/include -I/usr/local/lib/python3.8/dist-packages/tensorflow/include -D_GLIBCXX_USE_CXX11_ABI=1 -DEIGEN_MAX_ALIGN_BYTES=64 -I ../../../../scripts/../3dparty/libpng -I ../../../../scripts/../3dparty/zlib -I../../../ops/CommonUtils//../..//shared -I../../../ops/CommonUtils//../..//ops/CommonUtils -I../../../ops/CommonUtils//../..//ops/CommonUtils/Png --keep --keep-dir ../ArrayMPA/Debug -c ../ArrayMPA/src/mpa_utility.cu -o ../ArrayMPA/Debug/mpa_utility.x86-64.o

Again I saw the following <optimized out> message:

Thread 1 "TestMPA.x86-64" hit Breakpoint 1, generateCandidates_allCaAtOnce_idxCaFrameMerged_sharedMem_warpConvergence_knl<<<(8,8,48),(32,2,1)>>> (pTestFrames=0x702000000, numFrame=8, frameHeight=1520, frameWidth=896, pCareAreas=0x7320ca800, pPitchPerCA=0x7320cc800, pMaxNumCandidatesPerCA=0x7320cd000, pHamSincPerCA=0x732000000, numCA=6, truncNum=7, subCaHeight=64, pCandidatesPerPixel=0x73420cb40, pNumCandidatesPerPixel=0x7320ce200) at ../ArrayMPA/src/mpa_utility.cu:1374
1374            float pitch = pPitchPerCA[idxCA];
[Switching focus to CUDA kernel 0, grid 6, block (1,0,0), thread (1,0,0), device 0, sm 0, warp 0, lane 1]
1374            float pitch = pPitchPerCA[idxCA];
$1 = <optimized out>

What is weird is that this <optimized out> message appears randomly for the same executable; e.g., it didn't show up in my first run but did in the second. My cuda-gdb script was written as follows:

set pagination off
set logging file gdb.output
set logging off

set breakpoint pending on

break /home/zfan/sandbox/Virgo-Algo-Container-3.0/Blazer/MercuryImageComputer/KT/leaf/Virgo/src/ops/RefGen/ArrayMPA/src/mpa_utility.cu:1374
commands 1
cuda kernel 0 block (1,0,0) thread (1,0,0)
print maxNumCandidates
end
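If a variable keeps showing up as <optimized out>, one generic way to narrow it down (a sketch using standard gdb commands, not something suggested in this thread) is to dump all locals at the breakpoint and also break a few lines later, where the variable is more likely to be live. Note that line 1380 below is a made-up example line number:

```gdb
break mpa_utility.cu:1374
commands 1
cuda kernel 0 block (1,0,0) thread (1,0,0)
# Show every local the debugger can see at this point
info locals
print maxNumCandidates
end

# Second breakpoint further into the kernel (1380 is hypothetical);
# if the variable prints here but not at 1374, it simply was not
# live yet at the earlier line.
break mpa_utility.cu:1380
commands 2
print maxNumCandidates
end
```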

and I ran it with

sudo cuda-gdb -x ../mpa_debug_script --args ./TestMPA.x86-64 ../debug_data frame_info.txt 1

When I ran nvcc -V, I got

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0

Further, when I ran cuda-gdb --version, I got

NVIDIA (R) CUDA Debugger
11.7 release
Portions Copyright (C) 2007-2022 NVIDIA Corporation
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Is it a known bug in nvcc or cuda-gdb in CUDA 11.7?

OK, so it looks like the scripting is working for you now, but there is an issue with <optimized out> being displayed for certain variables.

Without having the repro (or at least the relevant kernel code) it's hard to tell what exactly might have gone wrong here. So:

Is there a container that provides the newest CUDA version together with a matching TensorFlow? We need not only CUDA but also a compatible TensorFlow version for our product.

As for the kernel, I cannot provide it because of corporate IP. Unfortunately, in most cases (student coursework aside), source code cannot be released on this forum. I guess NVIDIA may need a way to reproduce such issues and investigate independently of customers/developers.