How do I use nv-nsight-cu-cli and the GUI version for profiling?

I was trying to visualize the structure of a CUDA graph. The code I’m using is:
https://github.com/NVIDIA/cuda-samples/tree/master/Samples/simpleCudaGraphs
However, whenever I run the command nv-nsight-cu-cli ./simpleCudaGraphs, it just says “==PROF== No kernels were profiled”. When I use nvprof, I do get output with the kernel execution times. How do I use Nsight Compute to profile the code?

When I use the GUI and just let the application run, it throws the error “Connection error detected communicating with target application”. The profile option in the GUI is also confusing. I can’t select a file to open because it doesn’t exist, or else the GUI automatically appends a file extension to the name. I could not find anything in the official documentation that solves this problem. How do I use Nsight Compute properly? Thanks!

System: Ubuntu 18.04.2 LTS
GPU: 4xTesla P100
nvcc version: 10.1

Did you run nv-nsight-cu-cli with root permission? Adding sudo resolved my “No kernels were profiled” situation.

Yes. But when I run sudo /usr/local/cuda-10.1/NsightCompute-2019.1/nv-nsight-cu-cli ./app, I get a different error:

==ERROR== The application returned an error code (11)
==WARNING== No kernels were profiled

This only happens when I ssh into my Tesla P100 machine. The same program runs without problems when I test it locally on my GTX 1080. Any idea why?

It seems that your program was aborted. Return code 11 is perhaps a segmentation fault (signal 11 is SIGSEGV on Linux).
Does it run normally on the P100 without the profiler?
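
One way to check this is to launch the binary directly, without the profiler, and translate the return code into something readable. Below is a minimal sketch, assuming Python 3 is available on the P100 machine. The helper names signal_name and describe_returncode are made up for illustration and are not part of any NVIDIA tool. Reading the profiler's bare "11" as a signal number is also only a guess.

```python
import signal
import subprocess
import sys


def signal_name(signum):
    """Return the symbolic name for a signal number, or None if it is unknown."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return None


def describe_returncode(code):
    """Describe a child's return code (hypothetical helper, for illustration only)."""
    if code == 0:
        return "exited normally"
    if code < 0:
        # subprocess reports "killed by signal N" as the negative value -N.
        name = signal_name(-code) or "unknown"
        return f"killed by signal {-code} ({name})"
    if code > 128 and signal_name(code - 128):
        # Shells report a signal death as 128 + N, e.g. 139 for signal 11.
        return (f"exit status {code}, shell convention for signal "
                f"{code - 128} ({signal_name(code - 128)})")
    return f"exit status {code}"


if __name__ == "__main__":
    # Usage: python3 check_exit.py ./simpleCudaGraphs
    if len(sys.argv) > 1:
        result = subprocess.run(sys.argv[1:])
        print(describe_returncode(result.returncode))
```

If the application also reports a signal when started this way, the crash is probably in the application or its environment on that machine rather than in the profiler.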