Hello,
I need help debugging GPU code using Nsight Eclipse Edition (NEE).
Here are the details of my setup:
Host:
- Ubuntu 16.04
- JetPack 4.3; CUDA 10.0
- NEE v10.0
Target:
- Jetson AGX Xavier
I am cross-building for the ARM platform from my host PC.
I am able to do the following:
- Cross-build applications (a simple hello world using CPU + GPU, as well as samples provided in the samples directory) for the target from the host, using NEE
- Run the cross-built application on the target from the host, using NEE; works as expected
- Single-step CPU code in NEE, including setting breakpoints
- Debug CPU and GPU code from the command line (cuda-gdb, without NEE)
I am NOT able to do the following:
- Single-step GPU code using NEE. I can set breakpoints in the CUDA code, but when I resume debugging from the CPU code, the screen freezes and control is lost. I cannot even terminate the debugging session cleanly.
I understand that “Debugging a CUDA GPU involves pausing that GPU. When the graphics desktop manager is running on the same GPU, then debugging that GPU freezes the GUI and makes the desktop unusable.”
So I have disabled the GUI completely on the target and am logging in to the Jetson remotely from a console.
I have tried two ways to put the Jetson in console-only mode:
$ sudo systemctl set-default multi-user.target
and
by changing the run level.
Neither made any difference to the outcome.
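For anyone reproducing this, here is a minimal sketch of how the console-only state could be confirmed on the target. It assumes a systemd-based image; `is_console_only` is a hypothetical helper name used only for illustration, not something shipped with JetPack.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given systemd target name
# corresponds to a boot without a graphical session.
is_console_only() {
    case "$1" in
        multi-user.target) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query systemd when systemctl is actually available.
if command -v systemctl >/dev/null 2>&1; then
    default_target=$(systemctl get-default 2>/dev/null || true)
    if is_console_only "$default_target"; then
        echo "default target is console-only: $default_target"
    else
        echo "default target is graphical or unknown: $default_target"
    fi
fi
```

Note that `set-default` only changes what is started at the next boot, whereas `sudo systemctl isolate multi-user.target` switches the running system, so the check above describes the boot default rather than the current session.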
I also made the following changes, with no difference to the outcome:
- Set DISPLAY to ":0" in the environment variables in NEE
- Disabled timeouts in /sys/kernel/debug/gpu.0/timeouts_enabled
- Enabled the “CUDA software preemption debugging” option in NEE
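As a sketch of how the timeout setting could be double-checked on the target, the snippet below reads the knob back and classifies it. The helper name `timeouts_state` is hypothetical, and the assumption that the file reads back `Y`/`N` (or `1`/`0`) is mine and may not hold for every L4T release.

```shell
#!/bin/sh
# Hypothetical helper: classify the contents of the GPU timeouts knob.
# Assumes the file reads back Y/N or 1/0 (an unverified assumption).
timeouts_state() {
    knob="$1"
    if [ ! -r "$knob" ]; then
        echo "unreadable"
        return 0
    fi
    # Strip the trailing newline and any other whitespace before matching.
    case "$(tr -d '[:space:]' < "$knob")" in
        N|n|0) echo "disabled" ;;
        Y|y|1) echo "enabled" ;;
        *)     echo "unknown" ;;
    esac
}

# Path taken from the list above; reading debugfs normally requires root.
timeouts_state /sys/kernel/debug/gpu.0/timeouts_enabled
```

If this prints "unreadable", the likely causes are missing root privileges or debugfs not being mounted, so the change may not have taken effect.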
The console log from NEE is below, and a screenshot of the debugging perspective is attached.
Please let me know what I am missing.
Thanks,
Mithun
==========================================================================================
<< When the debugger first starts up, this is the console output in NEE >>
#############################################################################
Coalescing of the CUDA commands output is off.
warning: “remote:” is deprecated, use “target:” instead.
warning: sysroot set to “target://”.
Reading /lib/ld-linux-aarch64.so.1 from remote target…
warning: File transfers from remote targets can be slow. Use “set sysroot” to access files locally instead.
Reading /lib/ld-linux-aarch64.so.1 from remote target…
Reading /lib/ld-2.27.so from remote target…
Reading /lib/.debug/ld-2.27.so from remote target…
0x0000007fb7fd31c0 in ?? () from target:/lib/ld-linux-aarch64.so.1
$1 = 0xff
The target endianness is set automatically (currently little endian)
Reading /lib/aarch64-linux-gnu/librt.so.1 from remote target…
Reading /lib/aarch64-linux-gnu/libpthread.so.0 from remote target…
Reading /lib/aarch64-linux-gnu/libdl.so.2 from remote target…
Reading /usr/lib/aarch64-linux-gnu/libstdc++.so.6 from remote target…
Reading /lib/aarch64-linux-gnu/libgcc_s.so.1 from remote target…
Reading /lib/aarch64-linux-gnu/libc.so.6 from remote target…
Reading /lib/aarch64-linux-gnu/libm.so.6 from remote target…
Reading /lib/aarch64-linux-gnu/librt-2.27.so from remote target…
Reading /lib/aarch64-linux-gnu/.debug/librt-2.27.so from remote target…
Reading /lib/aarch64-linux-gnu/47f37309461cc15fb1915bc198d718017a1f87.debug from remote target…
Reading /lib/aarch64-linux-gnu/.debug/47f37309461cc15fb1915bc198d718017a1f87.debug from remote target…
Reading /lib/aarch64-linux-gnu/libdl-2.27.so from remote target…
Reading /lib/aarch64-linux-gnu/.debug/libdl-2.27.so from remote target…
Reading /usr/lib/aarch64-linux-gnu/d7646e96801c7eed3642d3c10e301e0f3ea553.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/.debug/d7646e96801c7eed3642d3c10e301e0f3ea553.debug from remote target…
Reading /lib/aarch64-linux-gnu/4bfa7077953acb0e38a1039923ecb5fe9f6a62.debug from remote target…
Reading /lib/aarch64-linux-gnu/.debug/4bfa7077953acb0e38a1039923ecb5fe9f6a62.debug from remote target…
Reading /lib/aarch64-linux-gnu/libc-2.27.so from remote target…
Reading /lib/aarch64-linux-gnu/.debug/libc-2.27.so from remote target…
Reading /lib/aarch64-linux-gnu/libm-2.27.so from remote target…
Reading /lib/aarch64-linux-gnu/.debug/libm-2.27.so from remote target…
Temporary breakpoint 1, main () at …/src/hello1.cu:8
8 int main(void) {
Breakpoint 2, main () at …/src/hello1.cu:10
10 print_from_gpu<<<1,5>>>();
#############################################################################
<< When I press F8 to continue to the breakpoint in the CUDA code, the display freezes >>
Reading /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1 from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvrm_gpu.so from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvrm.so from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvrm_graphics.so from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvidia-fatbinaryloader.so.32.3.1 from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvos.so from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/.debug/libcuda.so.1.1.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvrm_gpu.so.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/.debug/libnvrm_gpu.so.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvrm.so.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/.debug/libnvrm.so.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvrm_graphics.so.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/.debug/libnvrm_graphics.so.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvidia-fatbinaryloader.so.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/.debug/libnvidia-fatbinaryloader.so.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/libnvos.so.debug from remote target…
Reading /usr/lib/aarch64-linux-gnu/tegra/.debug/libnvos.so.debug from remote target…
#############################################################################
==========================================================================================