Remote application development using NVIDIA® Nsight™ Eclipse Edition

Originally published at:

NVIDIA® Nsight™ Eclipse Edition (NSEE) is a full-featured unified CPU+GPU integrated development environment (IDE) that lets you easily develop CUDA applications for either your local (x86_64) system or a remote (x86_64 or ARM) target system. In my last post on remote development of CUDA applications, I covered NSEE’s cross compilation mode. In this post I will focus…

I followed the above instructions up to trying to run the particles example.

Here is what I get:

Last login: Sun Aug 31 21:22:47 2014 from

echo $PWD'>'

/bin/sh -c "cd \"/home/ubuntu/cuda-wrokspace\";export LD_LIBRARY_PATH=\"/usr/local/cuda-6.0/lib\":\${LD_LIBRARY_PATH};\"/home/ubuntu/cuda-wrokspace/particles\"";exit

ubuntu@tegra-ubuntu:~$ echo $PWD'>'


ubuntu@tegra-ubuntu:~$ /bin/sh -c "cd \"/home/ubuntu/cuda-wrokspace\";export LD_LIBRARY_PATH=\"/usr/local/cuda-6.0/lib\":\${LD_LIBRARY_PATH};\"/home/ubuntu/cuda-wrokspace/particles\"";exit

/bin/sh: 1: /home/ubuntu/cuda-wrokspace/particles: Permission denied


Do you have any idea what the source of this problem is?

Sorry about the late response. Nsight 6.0 had a flaky connection bug that can cause such file permission update issues on the Jetson TK1; Nsight 6.5 has this bug fixed. Since your target is a Jetson TK1, you are doing the right thing by continuing to use the 6.0 toolkit. To work around the issue, try updating the file permissions manually on the target, enabling execute and write permissions with "chmod 777 particles".
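As a rough sketch of that workaround, the helper below checks whether a binary on the target has its execute bit set and restores it if not. The function name fix_exec_bit is hypothetical, and it uses "chmod u+x", which is narrower than the "chmod 777" suggested above; it assumes the only problem is the missing execute permission.

```shell
# Sketch only: restore the execute bit on a binary that was copied to the target.
# fix_exec_bit is a hypothetical helper name, not part of Nsight or the CUDA toolkit.
fix_exec_bit() {
    # $1: path to the binary on the target
    if [ ! -e "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -x "$1" ]; then
        # Narrower than chmod 777: only add execute permission for the owner.
        chmod u+x "$1"
        echo "fixed: $1"
    else
        echo "ok: $1"
    fi
}

# Example, run on the Jetson with the path from the log above:
#   fix_exec_bit /home/ubuntu/cuda-wrokspace/particles
```

If Nsight re-uploads the binary on the next launch, the permission may have to be restored again.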

Thanks a lot for the excellent tutorial. In my case the host is a MacOSX machine and the remote target is a Jetson TK1, and I managed to compile and run some CUDA projects successfully. I would just like to add that I had to change the (remote) compiler's path to /usr/bin/arm-linux-gnueabihf-g++-4.8 (the default was g++ version 4.6 for the ARM architecture, while my target already had version 4.8 installed). To do so, go to "Project name>Properties>Build>Settings>Build stages>Compiler path" and do the same at "Project name>Properties>Build>Settings>NVCC Linker>Miscellaneous>Compiler path" (see the attached figures).

Excellent, great to know that you succeeded in creating CUDA applications for the Jetson TK1 using your MacOSX host system. For Jetson TK1 targets with default g++-4.8, yes, your proposed change in Nsight Eclipse Edition is necessary. One could also create a g++-4.6 symlink on the target as follows: sudo ln -sf `which arm-linux-gnueabihf-g++` /usr/bin/arm-linux-gnueabihf-g+
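Before changing the compiler path or creating a symlink, it can help to see which of the compiler names mentioned in this thread actually resolve on the machine in question. The following is a minimal sketch; find_compilers is a hypothetical helper name, and the two names passed to it are simply the ones quoted above.

```shell
# Sketch only: report which candidate compiler names are on PATH.
# find_compilers is a hypothetical helper, not an Nsight or CUDA tool.
find_compilers() {
    for cxx in "$@"; do
        if path=$(command -v "$cxx" 2>/dev/null); then
            echo "found: $cxx -> $path"
        else
            echo "not found: $cxx"
        fi
    done
}

# The names below are the ones mentioned in the comments above.
find_compilers arm-linux-gnueabihf-g++-4.8 arm-linux-gnueabihf-g++
```

Whichever name is reported as found is the one to enter in both "Compiler path" fields.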

bug #1566745
MacOSX host Jetson target builds and runs particles fine, however:

Debugging Your Remote Application
set break point then
“Debug As->Remote C/C++ Application”

error message:

Error in final launch sequence
Failed to execute MI command:
-target-select remote
Error message from debugger back end:
cuda-gdb version (6.5.121) is not compatible with cuda-gdbserver version (6.0.116).\nPlease use the same version of cuda-gdb and cuda-gdbserver.

NB “Debug Perspective” dialogue not raised

please note on Jetson target:

/usr/local/cuda-6.0/bin$ ./cuda-gdbserver --version
NVIDIA (R) CUDA gdbserver
6.5 release
/usr/local/cuda-6.0/bin$ ./cuda-gdb -v
NVIDIA (R) CUDA Debugger
6.0 release

seems for Jetson one must install CUDA 6.0 toolkit on MacOSX

add "/usr/local/cuda-6.0/samples/common/inc"
to Properties Settings NVCC compiler Includes

wfm cuda-6.0 on OSX with same on Jetson

not needed for some reason on cuda 6.5 but see my note below

That's right: if you are using a Jetson TK1 as the target device, please continue to use the 6.0 toolkit as mentioned in the CUDA toolkit setup section. The CUDA 6.5 toolkit will be available in a future Jetson TK1 OS image version, Rel21.2. You can check your Jetson TK1 release as follows:

> head -1 /etc/nv_tegra_release
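The version mismatch reported above can also be checked directly. The sketch below pulls the "X.Y release" line out of the version banner and compares two values; release_of and same_release are hypothetical helper names, and the banner format is assumed to match the output pasted earlier in this thread.

```shell
# Sketch only: compare the release reported by cuda-gdb and cuda-gdbserver.
# release_of and same_release are hypothetical helpers; the banner format
# ("X.Y release" on its own line) is taken from the output quoted in this thread.
release_of() {
    # Reads a version banner on stdin and prints the first "X.Y" release found.
    sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\) release.*/\1/p' | head -n 1
}

same_release() {
    [ "$1" = "$2" ]
}

# Demo using the two banners pasted above.
gdb_banner='NVIDIA (R) CUDA Debugger
6.0 release'
server_banner='NVIDIA (R) CUDA gdbserver
6.5 release'

gdb_rel=$(printf '%s\n' "$gdb_banner" | release_of)
server_rel=$(printf '%s\n' "$server_banner" | release_of)

if same_release "$gdb_rel" "$server_rel"; then
    echo "match: $gdb_rel"
else
    echo "mismatch: cuda-gdb $gdb_rel vs cuda-gdbserver $server_rel"
fi

# With real tools, the banners would come from, for example:
#   cuda-gdb --version                 (on the host)
#   ./cuda-gdbserver --version         (on the target, in /usr/local/cuda-6.0/bin)
```

If the two releases differ, the debugger will refuse to connect, as the error message above shows.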

I can step through CPU code, but after setting a breakpoint and clicking resume, the Registers values all show "Error: Target not available" and Disassembly shows "No debug context".

please advise further


OSX host Jetson target

I followed this sample using a 64-bit Linux host and the Jetson TK1 as the remote target. It worked just fine, but the remote application only runs at 6-7 frames per second. The one built on the Jetson from the CUDA samples directory runs at 60 fps. I was wondering what caused the slowdown and how to fix it?

Great, good to know you are running code on your Jetson TK1. On the perf issue, note that the Jetson TK1 has a Kepler-class GPU, so make sure you check SM32 (3.2) in the "Generate GPU code" option under Project>Build>Settings>CUDA.

I set the GPU code to 3.2 and the PTX code to 3.0 and still didn't see a performance increase.

Make sure you are not running in the debugger and that the "Enable CUDA memcheck" box is unchecked under debug configurations->Debugger tab.

Ok it worked. I just needed to run it from the release build.
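For readers wondering why the build configuration matters so much: nvcc's -G flag generates debug information for device code and turns off device-code optimizations, which can slow kernels considerably. The sketch below contrasts flag sets for the two configurations; the exact flags Nsight passes are an assumption here, nvcc_flags is a hypothetical helper, and particles.cu is a placeholder file name.

```shell
# Sketch only: typical debug vs. release flag sets for nvcc.
# The exact flags Nsight uses per configuration are an assumption;
# nvcc_flags is a hypothetical helper name.
nvcc_flags() {
    case "$1" in
        # -G: device debug info, disables device-code optimization
        debug)   echo "-G -g -O0 -gencode arch=compute_32,code=sm_32" ;;
        # -O3: optimized host code; device code is optimized by default without -G
        release) echo "-O3 -gencode arch=compute_32,code=sm_32" ;;
        *)       echo "usage: nvcc_flags debug|release" >&2; return 1 ;;
    esac
}

# Example (requires nvcc; particles.cu is a placeholder):
#   nvcc $(nvcc_flags release) -o particles particles.cu
```

The sm_32 target matches the SM32 (3.2) setting recommended above for the Jetson TK1.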

Is it possible to have Nsight also index header files on the remote machine? For example, I have a project which uses the OpenCV library, which is installed only on the Jetson TK1 system. Locally I don't have an OpenCV installation...

Nope, you can index only project files on the host system. In sync project mode, if you maintain files on the host, those will get synced with the target.

I've read that newer Eclipse versions provide this functionality. AFAIK, Nsight is currently based on Eclipse Juno.
Are there any plans to upgrade to a newer version of Eclipse?

Yeah, sometime later in 2015, to Eclipse 4.4 Luna.

I'm using a freshly installed JetPack (CUDA 6.5) and trying to remotely debug the particles example (Jetson target). I'm able to build and run the example, but when debugging, kernel breakpoints are not hit. It stops once in main, and after I click resume I can see that the application starts on the target (the application window opens), but it never stops in the kernel and nothing happens in the application window.

Any suggestions?