Failing to run "Hello World" remotely on Jetson TX2 from Nsight Eclipse Edition - CUDA 10.0

Hello everyone,

I would like to develop an application in the Nsight Eclipse Edition that comes with the CUDA 10.0 toolkit. I am using a Jetson TX2, and I have set up my system (Jetson + host PC) with JetPack 4.2 and SDK Manager.

I have successfully flashed the OS image and installed CUDA, TensorRT, etc. on the Jetson TX2, and the CUDA toolkit is also installed on my host computer. My host runs Ubuntu 18.04.

I have followed the instructions. My host has the CUDA 10.0 cross-platform package installed; here is the terminal output:

Reading package lists... Done
Building dependency tree       
Reading state information... Done
cuda-cross-aarch64 is already the newest version (10.0.166-1).
The following package was automatically installed and is no longer required:
Use 'sudo apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 202 not upgraded.

I have also set my environment based on this by changing cuda10.1 to cuda10.0 and NsightCompute-2019.1 to NsightCompute-1.0:

In Nsight, I modified the project build properties by adding /usr/local/cuda-10.0/targets/aarch64-linux/lib to Build->Settings->NVCC Linker->Libraries->Library search path (-L). After I click "Apply", "Libraries (-l)" is still empty. Is that normal?

Additionally, I set the CPU Architecture of Jetson TX2 (remote) to AArch64.

I am able to build my project successfully, here is the console output:

11:16:08 **** Build of configuration Debug for project HelloCuda ****
make all -C /home/ktnvidia/CUDA_Projects/HelloCuda/Debug 
make: Entering directory '/home/ktnvidia/CUDA_Projects/HelloCuda/Debug'
Building file: ../
Invoking: NVCC Compiler
/usr/local/cuda-10.0/bin/nvcc -G -g -O0 -ccbin aarch64-linux-gnu-g++ -gencode arch=compute_35,code=sm_35 -gencode arch=compute_60,code=sm_60 -m64 -odir "." -M -o "helloworld.d" "../"
/usr/local/cuda-10.0/bin/nvcc -G -g -O0 --compile --relocatable-device-code=false -gencode arch=compute_35,code=compute_35 -gencode arch=compute_60,code=compute_60 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_60,code=sm_60 -m64 -ccbin aarch64-linux-gnu-g++  -x cu -o  "helloworld.o" "../"
Finished building: ../
Building target: HelloCuda
Invoking: NVCC Linker
/usr/local/cuda-10.0/bin/nvcc --cudart static -L/usr/local/cuda-10.0/targets/aarch64-linux/lib --relocatable-device-code=false -gencode arch=compute_35,code=compute_35 -gencode arch=compute_60,code=compute_60 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_60,code=sm_60 -m64 -ccbin aarch64-linux-gnu-g++ -link -o  "HelloCuda"  ./helloworld.o   
Finished building target: HelloCuda
make: Leaving directory '/home/ktnvidia/CUDA_Projects/HelloCuda/Debug'
> Shell Completed (exit code = 0)

11:16:12 Build Finished (took 4s.276ms)

However, when I run my code:

#include "stdio.h"

__global__ void helloFromGPU(void)
{
   printf("Hello from GPU");
}

int main()
{
   printf("Hello from CPU");
   helloFromGPU <<<1, 5>>>();
   return 0;
}

I see this result:

o $PWD'>'
/bin/sh -c "cd \"/home/ktnvidia/CUDA_Projects/HelloCuda/Debug\";export LD_LIBRARY_PATH=\"/usr/local/cuda-10.0/lib64\":\${LD_LIBRARY_PATH};export NVPROF_TMPDIR=\"/tmp\";\"/home/ktnvidia/CUDA_Projects/HelloCuda/Debug/HelloCuda\"";exit
ktnvidia@ktnvidia-desktop:~$ echo $PWD'>'
ktnvidia@ktnvidia-desktop:~$ /bin/sh -c "cd \"/home/ktnvidia/CUDA_Projects/HelloCCuda/Debug\";export LD_LIBRARY_PATH=\"/usr/local/cuda-10.0/lib64\":\${LD_LIBRARY__PATH};export NVPROF_TMPDIR=\"/tmp\";\"/home/ktnvidia/CUDA_Projects/HelloCuda/Debbug/HelloCuda\"";exit
Hello from CPUlogout

It seems I am not able to run my device code on the Jetson TX2. Do you have any idea how I can run this simple code on the Jetson TX2?

Thank you very much for your help
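
For reference, here is a minimal sketch of the same program with explicit error checks and a synchronization point. It assumes (this is not confirmed anywhere in the thread) that the GPU output is missing because main() returns before the kernel has finished and the device-side printf buffer has been flushed. The CHECK_CUDA macro is a hypothetical helper introduced only for this illustration; cudaGetDeviceProperties, cudaGetLastError and cudaDeviceSynchronize are standard CUDA runtime calls.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the original post):
// print the failing call and the CUDA error string, then exit.
#define CHECK_CUDA(call)                                          \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s failed: %s\n", #call,             \
                    cudaGetErrorString(err_));                    \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

__global__ void helloFromGPU(void)
{
    // Device-side printf writes into a buffer that the host flushes later.
    printf("Hello from GPU (thread %d)\n", threadIdx.x);
}

int main()
{
    printf("Hello from CPU\n");

    // Report which device is used and its compute capability, so it can be
    // compared with the -gencode flags shown in the build log.
    cudaDeviceProp prop;
    CHECK_CUDA(cudaGetDeviceProperties(&prop, 0));
    printf("Device 0: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);

    helloFromGPU<<<1, 5>>>();

    // Kernel launches are asynchronous: first check that the launch itself
    // was accepted, then wait for the kernel to finish. The wait is also
    // the point at which the device-side printf output is flushed.
    CHECK_CUDA(cudaGetLastError());
    CHECK_CUDA(cudaDeviceSynchronize());

    return 0;
}
```

If the "Hello from GPU" lines appear with this version, the original program was most likely exiting before the kernel output was flushed. If CHECK_CUDA reports an error instead, the message should point closer to the actual cause, for example a mismatch between the reported compute capability and the architectures the binary was built for.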


You will need root privileges for the profiling tool.
Have you logged in with the root account?

If not, here is some information for your reference: