Hi,
I’m trying out Nsight Eclipse Edition on my host PC (x86_64 Ubuntu VM) to remotely debug CUDA programs on a TX2. I got as far as launching the remote CUDA program from Nsight, but I could not debug it.
It first prints the log below, which looks normal for a remote debug session.
Last login: Fri Jun 9 18:00:20 2017 from 192.168.31.26
echo $PWD'>'
/bin/sh -c "cd \"/home/nvidia/test/Debug\";export NVPROF_TMPDIR=\"/tmp\";\"/usr/local/cuda-8.0/bin/cuda-gdbserver\" --cuda-use-lockfile=0 :2345 \"/home/nvidia/test/Debug/test\"";exit
nvidia@tegra-ubuntu:~$ echo $PWD'>'
/home/nvidia>
nvidia@tegra-ubuntu:~$ /bin/sh -c "cd \"/home/nvidia/test/Debug\";export NVPROF_TMPDIR=\"/tmp\";\"/usr/local/cuda-8.0/bin/cuda-gdbserver\" --cuda-use-lockfile=0 :2345 \"/home/nvidia/test/Debug/test\"";exit
Process /home/nvidia/test/Debug/test created; pid = 7172
Listening on port 2345
Remote debugging from host 192.168.31.26
Then, after freezing at this stage for a long time (a few minutes), it finally switches to the debug perspective and prints this log, which I don’t understand.
Coalescing of the CUDA commands output is off.
$1 = 0xff
The target endianness is set automatically (currently little endian)
I read some earlier topics about this. Some said that debugging on the GPU requires a dual-GPU target device, because the GPU currently in use cannot be halted. Is that why I failed to debug? If not, what other reasons could there be?
SSH must be working, because I successfully started a CUDA sample on the target from the host. As for the guide, yes, I followed most of it. I skipped the cross-compiler configuration and chose synchronized project mode.
Yes, I installed everything through JetPack-L4T-3.0-linux-x64.run. The CUDA version is V8.0.62.
Here is the console output of a remote run in the Debug profile. This is a cross-compiled project using the CUDA sample code matrixMul.
Last login: Tue Jun 13 15:16:44 2017 from 192.168.31.80
echo $PWD'>'
/bin/sh -c "cd \"/home/nvidia/test/Debug\";export LD_LIBRARY_PATH=\"/usr/local/cuda-8.0/lib64\":\${LD_LIBRARY_PATH};\"/home/nvidia/test/Debug/test\"";exit
nvidia@tegra-ubuntu:~$ echo $PWD'>'
/home/nvidia>
nvidia@tegra-ubuntu:~$ /bin/sh -c "cd \"/home/nvidia/test/Debug\";export LD_LIBRARY_PATH=\"/usr/local/cuda-8.0/lib64\":\${LD_LIBRARY_PATH};\"/home/nvidia/test/Debug/test\"";exit
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "GP10B" with compute capability 6.2
MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 1.57 GFlop/s, Time= 83.316 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
logout
I also tested remote debugging with cuda-gdb from the command line. It is very slow, but it works.
On the remote target:
/usr/local/cuda/bin/cuda-gdbserver :8080 test
Process test created; pid = 3565
Listening on port 8080
On the host machine:
xliu@ubuntu:~/work/cuda/test/Debug$ cuda-gdb test
NVIDIA (R) CUDA Debugger
8.0 release
Portions Copyright (C) 2007-2016 NVIDIA Corporation
GNU gdb (GDB) 7.6.2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/xliu/work/cuda/test/Debug/test...done.
(cuda-gdb) target remote 192.168.31.37:8080
Remote debugging using 192.168.31.37:8080
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
0x0000007fb7fd2d80 in ?? ()
(cuda-gdb)
(cuda-gdb) b matrixMul.cu:364
Breakpoint 1 at 0x404654: file ../src/matrixMul.cu, line 364.
(cuda-gdb) c
Continuing.
warning: Could not load shared library symbols for 8 libraries, e.g. /lib/aarch64-linux-gnu/librt.so.1.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
Breakpoint 1, main (argc=1, argv=0x7fffffef78) at ../src/matrixMul.cu:364
364 if (checkCmdLineFlag(argc, (const char **)argv, "help") ||
(cuda-gdb) p argc
$1 = 1
(cuda-gdb) next
376 int devID = 0;
(cuda-gdb)
378 if (checkCmdLineFlag(argc, (const char **)argv, "device"))
(cuda-gdb) p devID
$2 = 0
(cuda-gdb)
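The command-line session above can also be scripted, and its shared-library warning itself suggests `set sysroot`. The following is a minimal sketch, not a verified fix for the Nsight hang. The sysroot directory `$HOME/tx2-sysroot` (a host-side copy of the target's libraries) and the file name `remote.gdb` are assumptions; the address, breakpoint, and binary name are taken from the session above.

```shell
#!/bin/sh
# Sketch: generate a gdb command file that replays the remote session above.
# ASSUMPTION: SYSROOT is a host-side copy of the TX2's /lib and /usr/lib trees.
TARGET="192.168.31.37:8080"
SYSROOT="$HOME/tx2-sysroot"

make_gdb_script() {
    # $1 = host:port of cuda-gdbserver, $2 = sysroot directory on the host
    # The sysroot is set before connecting so libraries resolve on attach.
    printf 'set sysroot %s\n' "$2"
    printf 'target remote %s\n' "$1"
    printf 'break matrixMul.cu:364\n'
    printf 'continue\n'
}

make_gdb_script "$TARGET" "$SYSROOT" > remote.gdb

# Then, on the host, with cuda-gdbserver already listening on the target:
#   cuda-gdb -x remote.gdb test
```

Keeping the commands in a file makes it easy to rerun the same steps after each slow reconnect.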
We need more information to find the root cause. Please provide the following:
You can attach files from the attachment button at the upper right corner of a posted comment.
1. cuda-gdb traces: while the debug session is hung, go to the Nsight EE Console view, click the drop-down next to the TV-like icon, select the gdb traces option, and copy the traces from the Console view.
2. Screenshot of the Nsight EE debug perspective.
3. Nsight EE log: $workspace/.metadata/.log
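For the Nsight EE log, a small helper like the one below may be handy for copying it out of the workspace before attaching it. This is only a sketch: the function name `collect_nsight_log` and the example workspace path in the comment are hypothetical; only the `.metadata/.log` location comes from the list above.

```shell
#!/bin/sh
# Sketch: copy the Nsight EE log out of an Eclipse workspace for attaching.
collect_nsight_log() {
    # $1 = Eclipse workspace directory, $2 = destination file
    log="$1/.metadata/.log"
    if [ -f "$log" ]; then
        cp "$log" "$2"
    else
        echo "no log found at $log" >&2
        return 1
    fi
}

# Example (hypothetical workspace path):
#   collect_nsight_log "$HOME/cuda-workspace" nsight-ee.log
```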
From the attached cuda-gdb traces, we see that the breakpoint is set on the main function but the breakpoint was never hit. Hence the program keeps running in Nsight.
We will keep tracking this.
Also, how do you connect to the device (Wi-Fi or Ethernet)?
Please try connecting over Ethernet (if you are not already) and also try waiting longer.