I compiled my code as shown in the following message:
Compiling ../ArrayMPA/src/mpa_utility.cu to make object file ../ArrayMPA/Debug/mpa_utility.x86-64.o.
-------------------------------------------------------------------
nvcc -x cu --compiler-options "-fPIC" --expt-relaxed-constexpr --gpu-architecture=sm_86 -Xcudafe --display_error_number -DPREFETCH=0 -DUSE_DCBX=0 -DX86 -D_LITTLE_ENDIAN_=1 -D__LITTLE_ENDIAN__=1 -DCPLUSPLUS_2011 -DDEBUG=2 -DDETECTION_DEBUG -DCHECKCUDAERROR -DGOOGLE_CUDA=1 -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES -DUNIT_TEST=1 -g -G -O0 -I../../../../../../../../../../../../../usr/local/include -I../ArrayMPA -I../ArrayMPA/include -I../../../../../../../../../../../../../usr/local/cuda/targets/x86_64-linux/include -I/usr/local/lib/python3.8/dist-packages/tensorflow/include -D_GLIBCXX_USE_CXX11_ABI=1 -DEIGEN_MAX_ALIGN_BYTES=64 -I ../../../../scripts/../3dparty/libpng -I ../../../../scripts/../3dparty/zlib -I../../../ops/CommonUtils//../..//shared -I../../../ops/CommonUtils//../..//ops/CommonUtils -I../../../ops/CommonUtils//../..//ops/CommonUtils/Png --keep --keep-dir ../ArrayMPA/Debug -c ../ArrayMPA/src/mpa_utility.cu -o ../ArrayMPA/Debug/mpa_utility.x86-64.o
Again, I saw the following <optimized out> message in cuda-gdb:
Thread 1 "TestMPA.x86-64" hit Breakpoint 1, generateCandidates_allCaAtOnce_idxCaFrameMerged_sharedMem_warpConvergence_knl<<<(8,8,48),(32,2,1)>>> (pTestFrames=0x702000000, numFrame=8, frameHeight=1520, frameWidth=896, pCareAreas=0x7320ca800, pPitchPerCA=0x7320cc800, pMaxNumCandidatesPerCA=0x7320cd000, pHamSincPerCA=0x732000000, numCA=6, truncNum=7, subCaHeight=64, pCandidatesPerPixel=0x73420cb40, pNumCandidatesPerPixel=0x7320ce200) at ../ArrayMPA/src/mpa_utility.cu:1374
1374 float pitch = pPitchPerCA[idxCA];
[Switching focus to CUDA kernel 0, grid 6, block (1,0,0), thread (1,0,0), device 0, sm 0, warp 0, lane 1]
1374 float pitch = pPitchPerCA[idxCA];
$1 = <optimized out>
What is weird is that this <optimized out> message appears at random for the same executable: for example, it did not show up in my first run but did show up in the second. My cuda-gdb script was written as follows:
set pagination off
set logging file gdb.output
set logging off
set breakpoint pending on
break /home/zfan/sandbox/Virgo-Algo-Container-3.0/Blazer/MercuryImageComputer/KT/leaf/Virgo/src/ops/RefGen/ArrayMPA/src/mpa_utility.cu:1374
commands 1
cuda kernel 0 block (1,0,0) thread (1,0,0)
print maxNumCandidates
end
and I ran it with
sudo cuda-gdb -x ../mpa_debug_script --args ./TestMPA.x86-64 ../debug_data frame_info.txt 1
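To put a number on how random the message is, the cuda-gdb output of each run could be saved to its own file and the files that contain the message counted. This is only a minimal sketch: the function name count_optimized_out and the log names run_1.log, run_2.log, ... are hypothetical and are not produced by the script above.

```shell
# Count how many of the given log files contain the literal text "<optimized out>".
# Assumes one log file per debugger run (hypothetical names such as run_1.log).
count_optimized_out() {
    hits=0
    for log in "$@"; do
        # -q: no output, only the exit status; -F: match a fixed string, not a regex
        if grep -qF '<optimized out>' "$log"; then
            hits=$((hits + 1))
        fi
    done
    echo "$hits of $# runs printed <optimized out>"
}
```

It would be called as count_optimized_out run_*.log once several runs have been captured.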
When I ran nvcc -V, I got:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
Further, when I ran cuda-gdb --version, I got:
NVIDIA (R) CUDA Debugger
11.7 release
Portions Copyright (C) 2007-2022 NVIDIA Corporation
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Is it a known bug in nvcc or cuda-gdb in CUDA 11.7?