Nsight Systems failed to start daemon while profiling a remote Linux server

I am stuck at the first step of trying to profile something on a Linux server: I can’t connect to the target server from the Nsight Systems GUI. Every time it reaches “Launching the tools app on the target”, it fails with “Failed to launch daemon”. I have checked the firewall rules, the gcc version, and the ldd version, and I even reinstalled CUDA. Nothing changed. It is quite frustrating. My setup is described below. If you need more info, please let me know. Thanks for any help.

Nsight Systems Version: 2022.4.1.21

Target server:
Hardware: Tesla V100 SXM2
Distribution: Ubuntu 18.04.4 LTS
Kernel: 4.15.0-176-generic
Driver Version: 520.61.05
CUDA Version: 11.8
GCC Version: 7.5.0
ldd version: 2.27

$ cat /proc/sys/kernel/perf_event_paranoid
$ apt list --installed | grep linux-headers
linux-headers-4.15.0-163/bionic-updates,bionic-security,now 4.15.0-163.171 all [installed,automatic]
linux-headers-4.15.0-163-generic/bionic-updates,bionic-security,now 4.15.0-163.171 amd64 [installed,automatic]
linux-headers-4.15.0-175/bionic-updates,bionic-security,now 4.15.0-175.184 all [installed,automatic]
linux-headers-4.15.0-175-generic/bionic-updates,bionic-security,now 4.15.0-175.184 amd64 [installed,automatic]
linux-headers-4.15.0-176/bionic-updates,bionic-security,now 4.15.0-176.185 all [installed]
linux-headers-4.15.0-176-generic/bionic-updates,bionic-security,now 4.15.0-176.185 amd64 [installed,automatic]
linux-headers-generic/bionic-updates,bionic-security,now amd64 [installed,automatic]
$ nsys status -e
Timestamp counter supported: Yes

CPU Profiling Environment Check
Root privilege: disabled
Linux Kernel Paranoid Level = 2
Linux Distribution = Ubuntu
Linux Kernel Version = 4.15.0-176-generic: OK
Linux perf_event_open syscall available: OK
Sampling trigger event available: OK
Intel(c) Last Branch Record support: Available
CPU Profiling Environment (process-tree): OK
CPU Profiling Environment (system-wide): Fail

That is a weird error message.

Can you SSH onto the system and run the CLI directly on the server?

What options are you trying to run?

Yes, I can SSH to the server. The “nsys status -e” output above was produced on the server over an SSH connection.
For now I’m just a beginner trying to learn profiling techniques, and I don’t have a specific program in mind to optimize.

Okay, if you are already on the server, you can just run the CLI there; there is no need to connect remotely from the GUI on the host. You will need to pick an application to profile. If you are just experimenting, I have used “top” as the application, although it doesn’t show much interesting content.


nsys profile top

That is enough to launch top and collect with the default options until you end the application.
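Once the default run works, a slightly more explicit command line can be sketched as below. The helper name profile_app and the DRY_RUN variable are made up for illustration and are not part of Nsight Systems; the flags are the short forms of --output, --force-overwrite and --trace, and the trace list is only an assumed example, so check "nsys profile --help" on your version.

```shell
# Hypothetical helper (not part of nsys): profile_app <report-name> <command> [args...]
profile_app() {
    report="$1"
    shift
    # -o names the report file, -f true overwrites an existing report,
    # -t selects the APIs to trace (cuda and osrt chosen here as an example).
    set -- nsys profile -o "$report" -f true -t cuda,osrt "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"    # only print the command line, do not run it
    else
        "$@"         # run the profiler; requires nsys on PATH
    fi
}

# Show what would be executed for the "top" example:
DRY_RUN=1 profile_app top_report top
```

The report written on the server can then be copied to the host and opened in the GUI, which avoids the remote connection entirely.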

See User Guide :: Nsight Systems Documentation (that’s a direct link to the section, even if it doesn’t look like it) for some example command lines and what they do.

At User Guide :: Nsight Systems Documentation (again a direct link) you will see a list of resources for learning more about Nsys. I especially recommend the first resource, the DLI self-paced learning course.

Thanks for your help. I will read it.
Also, I just found that the Nsight Systems GUI can somehow connect to the server now. Does it require an Internet connection to set up the profiling environment? I have done nothing in the past day except set up a tunnel so that it can access the Internet now. I might have missed this part in the installation manual.

I also have a question about the “CPU Profiling Environment (system-wide): Fail” line returned by “nsys status -e”. What does it mean? Would it affect some functions while profiling a program? If so, how can I fix it?

It should not require an internet connection to connect to the server. All I can think of is that setting up the internet connection opened one of the ports needed for communication from host to target and back. See the requirements section in the documentation for more info.

Your “nsys status -e” output shows that the Linux paranoid level on your system is 2. We use the Linux perf subsystem to collect information about the system, and at paranoid level 2 we can only get information about things happening in your process tree. If you need information about everything on the system, you need to set the paranoid level differently.

But you probably do not need that, as you are most likely profiling your application and not the computer in general. If you do need it, the user manual has information on how to change the level. There is also a chart in the user manual, in the “CPU profiling using the Linux Perf Subsystem” section, that tells you what information is available at each paranoid level.
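As a rough illustration of the paranoid levels, here is a small sketch. The function name describe_paranoid is made up, and the descriptions follow my reading of the Linux kernel's perf_event_paranoid documentation rather than the Nsight Systems chart, so treat the user manual as authoritative.

```shell
# Hypothetical helper: summarize what a perf_event_paranoid value
# allows for users without elevated privileges.
describe_paranoid() {
    level="$1"
    if [ "$level" -ge 2 ]; then
        echo "user-space profiling of your own processes only"
    elif [ "$level" -eq 1 ]; then
        echo "user and kernel profiling of your own processes"
    elif [ "$level" -eq 0 ]; then
        echo "CPU-wide events allowed (system-wide profiling)"
    else
        echo "almost no restrictions"
    fi
}

# Describe the current setting if the file is readable.
if [ -r /proc/sys/kernel/perf_event_paranoid ]; then
    describe_paranoid "$(cat /proc/sys/kernel/perf_event_paranoid)"
fi

# To change it until the next reboot (needs root), something like:
#   sudo sysctl -w kernel.perf_event_paranoid=0
```

With the level of 2 reported above, the process-tree check passes and the system-wide check fails, which matches the “nsys status -e” output.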

That’s weird. The firewall should allow connections on any port between the server and my IP, as I configured it that way. Maybe it has nothing to do with the network. Anyway, it’s working now. Huge thanks for your help.