Remote debugging a TX2 with Nsight fails

I’m trying out Nsight Eclipse Edition on my host PC (x86_64 Ubuntu VM) to remote debug CUDA programs on a TX2. I got as far as launching the remote CUDA program from Nsight, but I could not debug it.

It first prints the log below, which looks normal for a remote debug session.

Last login: Fri Jun  9 18:00:20 2017 from
echo $PWD'>'
/bin/sh -c "cd \"/home/nvidia/test/Debug\";export NVPROF_TMPDIR=\"/tmp\";\"/usr/local/cuda-8.0/bin/cuda-gdbserver\" --cuda-use-lockfile=0 :2345 \"/home/nvidia/test/Debug/test\"";exit
nvidia@tegra-ubuntu:~$ echo $PWD'>'
nvidia@tegra-ubuntu:~$ /bin/sh -c "cd \"/home/nvidia/test/Debug\";export NVPROF_TMPDIR=\"/tmp\";\"/usr/local/cuda-8.0/bin/cuda-gdbserver\" --cuda-use-lockfile=0 :2345 \"/home/nvidia/test/Debug/test\"";exit
Process /home/nvidia/test/Debug/test created; pid = 7172
Listening on port 2345
Remote debugging from host

Then, after freezing at this stage for a long time (a few minutes), it finally switches to the debug perspective and prints this log, which I don’t understand.

Coalescing of the CUDA commands output is off.
$1 = 0xff
The target endianness is set automatically (currently little endian)

I read some earlier topics about this. Some said that debugging on the GPU requires a dual-GPU target device, because the GPU currently in use cannot be halted. Is that why I failed to debug? If not, what other reasons could there be?


Did you follow this page to set up Nsight?

Also, could you check whether SSH works properly inside the Ubuntu VM?

hi AastaLLL,

SSH must be working, because I successfully started a CUDA sample on the target from the host. As for the guide, yes, I followed most of it. I skipped the cross-compiler configuration and chose synchronized project mode.


Did you install the host CUDA toolkit via JetPack?
Could you share your host CUDA version?

Another possible cause is some kind of traffic shaper configuration.
Could you use Debug Run and check whether there is more error output?

Yes, I installed everything that way. The CUDA version is V8.0.62.

Here is the console output of a remote run in the Debug profile. This is a cross-compiled project using the CUDA sample code matrixMul.

Last login: Tue Jun 13 15:16:44 2017 from
echo $PWD'>'
/bin/sh -c "cd \"/home/nvidia/test/Debug\";export LD_LIBRARY_PATH=\"/usr/local/cuda-8.0/lib64\":\${LD_LIBRARY_PATH};\"/home/nvidia/test/Debug/test\"";exit
nvidia@tegra-ubuntu:~$ echo $PWD'>'
nvidia@tegra-ubuntu:~$ /bin/sh -c "cd \"/home/nvidia/test/Debug\";export LD_LIBRARY_PATH=\"/usr/local/cuda-8.0/lib64\":\${LD_LIBRARY_PATH};\"/home/nvidia/test/Debug/test\"";exit
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "GP10B" with compute capability 6.2

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
Performance= 1.57 GFlop/s, Time= 83.316 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

hi AastaLLL,

I also tested the command line remote gdb. Although very slow, it works.

on remote target:

/usr/local/cuda/bin/cuda-gdbserver :8080 test
Process test created; pid = 3565
Listening on port 8080

on host machine:

xliu@ubuntu:~/work/cuda/test/Debug$ cuda-gdb test 
NVIDIA (R) CUDA Debugger
8.0 release
Portions Copyright (C) 2007-2016 NVIDIA Corporation
GNU gdb (GDB) 7.6.2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
Reading symbols from /home/xliu/work/cuda/test/Debug/test...done.
(cuda-gdb) target remote
Remote debugging using

warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
0x0000007fb7fd2d80 in ?? ()
(cuda-gdb) b
Breakpoint 1 at 0x404654: file ../src/, line 364.
(cuda-gdb) c
warning: Could not load shared library symbols for 8 libraries, e.g. /lib/aarch64-linux-gnu/
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?

Breakpoint 1, main (argc=1, argv=0x7fffffef78) at ../src/
364	    if (checkCmdLineFlag(argc, (const char **)argv, "help") ||
(cuda-gdb) p argc
$1 = 1
(cuda-gdb) next

376	    int devID = 0;
378	    if (checkCmdLineFlag(argc, (const char **)argv, "device"))
(cuda-gdb) p devID
$2 = 0
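
The warning in the session above asks whether `set solib-search-path` or `set sysroot` is needed. As a rough sketch of acting on that hint, assuming the TX2's library directories have already been copied to the host, the shell function below writes a gdb command file that points the debugger at the local copy. The function name `write_sysroot_cmds`, the copy location, and the output file are hypothetical. `/usr/lib/aarch64-linux-gnu` is assumed as a second library directory alongside the `/lib/aarch64-linux-gnu/` path seen in the warning. Both `set` commands are standard GDB; that cuda-gdb 8.0 treats them the same way is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: write a gdb command file that resolves the target's
# shared libraries from a local copy of the TX2 filesystem.
write_sysroot_cmds() {
    sysroot="$1"    # host directory holding the copied target libraries (assumed to exist)
    out="$2"        # gdb command file to create or overwrite
    {
        # Standard GDB command: prefix for looking up target shared libraries.
        printf 'set sysroot %s\n' "$sysroot"
        # Standard GDB command: extra directories searched for shared libraries.
        printf 'set solib-search-path %s/lib/aarch64-linux-gnu:%s/usr/lib/aarch64-linux-gnu\n' \
            "$sysroot" "$sysroot"
    } > "$out"
}

# Only write a file when both arguments are given on the command line.
if [ "$#" -eq 2 ]; then
    write_sysroot_cmds "$1" "$2"
fi
```

For example, saving this as `mkcmds.sh` (a hypothetical name), running `sh mkcmds.sh ~/tx2-sysroot tx2.gdb`, and then starting `cuda-gdb -x tx2.gdb test` would apply the settings before `target remote` is issued.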


From your comment, it sounds like you can successfully run the application sometimes?

hi AastaLLL,

Using Nsight Eclipse Edition, I can ALWAYS run the remote application, as shown in post #5, but I can NEVER debug a remote application, as stated in post #1.

Using command-line cuda-gdb on the host and cuda-gdbserver on the TX2, I can debug, as shown in post #6.

The goal is to use Nsight to do remote debugging.


Thanks for the clarification, and sorry for my earlier misunderstanding.
We are discussing this internally and will update you later.

By the way, could you also check this topic?

hi AastaLLL,

I’ve read the topic. I never saw permission related logs, and I can run the app remotely, so I don’t think it’s the same issue.

How is your internal discussion going?


Thanks for your feedback.
We are still checking this issue. Please wait for our update.



We need more information to find the root cause. Please provide the following.
You can attach files with the attachment button at the upper right corner of a posted comment.

1. cuda-gdb traces:
While the debug session hangs, go to the Nsight EE console view -> click the drop-down next to the TV-like icon -> select the gdb traces option -> copy the traces from the console view.
2. Screenshot of the Nsight EE debug perspective
3. Nsight EE log: $workspace/.metadata/.log
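
For the log file in the last item, a small sketch of copying it out of the hidden `.metadata` directory so it can be attached. The function name `collect_nsight_log` and the output file name are hypothetical; the only path taken from this thread is `$workspace/.metadata/.log`.

```shell
#!/bin/sh
# Hypothetical helper: copy the Nsight EE workspace log to an attachable file.
collect_nsight_log() {
    workspace="$1"                     # Eclipse workspace directory
    dest="$2"                          # output file, e.g. workspace-log.txt
    log="$workspace/.metadata/.log"    # log location named in the request above
    if [ ! -f "$log" ]; then
        echo "no log found at $log" >&2
        return 1
    fi
    cp "$log" "$dest"
}

# Only copy when both arguments are given on the command line.
if [ "$#" -eq 2 ]; then
    collect_nsight_log "$1" "$2"
fi
```

Called as `collect_nsight_log ~/cuda-workspace workspace-log.txt` (the workspace path here is only an example), it leaves the original log untouched and reports a missing log on stderr.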

Here’s the info. Please let me know if you need more

workspace-log.txt (10.9 KB)
gdb-trace.txt (32.3 KB)
remote-shell.txt (27.4 KB)


I will update you if we have more information or need more logs.


From the attached cuda-gdb traces, we see that a breakpoint is set on the main function but is never hit, so the program keeps running in Nsight.
We will keep tracking this.

How do you connect to the device (Wi-Fi or Ethernet)?
Please try connecting over Ethernet (if you are not already) and also try waiting longer.



This time I waited longer, and it finally hit the breakpoint after at least 30 minutes. The gdb trace is attached.

Why does it take so long? I was using Ethernet.

Both my PC and the TX2 board are on the same LAN, connected to the LAN ports of a Wi-Fi router.
gdb-trace-final.txt (39.7 KB)

It seems most of the time is spent loading symbols from shared libraries. Is it possible to disable loading of these symbols?

I think loading these symbols is essential.
I will check whether it is possible to bypass symbol loading.

Thanks for the feedback.
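
On the question of skipping shared-library symbols: stock GDB has a setting for this, `set auto-solib-add off`. A minimal sketch follows, assuming cuda-gdb 8.0 (which reports GDB 7.6.2 in its banner earlier in this thread) honors the setting the same way, and assuming Nsight can be pointed at a gdb command file; neither is confirmed here. The helper name `ensure_line` is hypothetical. It adds the setting to a command file only if it is not already present.

```shell
#!/bin/sh
# Hypothetical helper: make sure a gdb command file contains a line exactly once.
ensure_line() {
    line="$1"
    file="$2"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Only act when a command file is named on the command line.
if [ "$#" -eq 1 ]; then
    # Standard GDB setting: do not load shared-library symbols automatically.
    ensure_line 'set auto-solib-add off' "$1"
fi
```

With automatic loading off, `sharedlibrary REGEX` typed at the (cuda-gdb) prompt loads symbols for matching libraries on demand, and `info sharedlibrary` lists what has been loaded. Breakpoints inside libraries whose symbols were skipped will not resolve until those symbols are loaded.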