Nsight Systems failed to start daemon while profiling a remote Linux server

lengran · November 7, 2022, 12:24pm

I have been stuck at the first step of trying to profile something on a Linux server: I can’t connect to the target server from the Nsight Systems GUI. Every time it reaches “Launching the tools app on the target”, it fails and says “Failed to launch daemon”. I have checked the firewall rules, the gcc version, and the ldd version, and I even reinstalled CUDA. Nothing changed, which is quite frustrating. My setup is described below. If you need more info, please let me know. Thanks for any help.

Nsight Systems Version: 2022.4.1.21

Target server:
Hardware: Tesla V100 SXM2
Distribution: Ubuntu 18.04.4 LTS
Kernel: 4.15.0-176-generic
Driver Version: 520.61.05
CUDA Version: 11.8
GCC Version: 7.5.0
ldd version: 2.27

$ cat /proc/sys/kernel/perf_event_paranoid
2
$ apt list --installed | grep linux-headers
linux-headers-4.15.0-163/bionic-updates,bionic-security,now 4.15.0-163.171 all [installed,automatic]
linux-headers-4.15.0-163-generic/bionic-updates,bionic-security,now 4.15.0-163.171 amd64 [installed,automatic]
linux-headers-4.15.0-175/bionic-updates,bionic-security,now 4.15.0-175.184 all [installed,automatic]
linux-headers-4.15.0-175-generic/bionic-updates,bionic-security,now 4.15.0-175.184 amd64 [installed,automatic]
linux-headers-4.15.0-176/bionic-updates,bionic-security,now 4.15.0-176.185 all [installed]
linux-headers-4.15.0-176-generic/bionic-updates,bionic-security,now 4.15.0-176.185 amd64 [installed,automatic]
linux-headers-generic/bionic-updates,bionic-security,now 4.15.0.176.165 amd64 [installed,automatic]
$ nsys status -e
Timestamp counter supported: Yes

CPU Profiling Environment Check
Root privilege: disabled
Linux Kernel Paranoid Level = 2
Linux Distribution = Ubuntu
Linux Kernel Version = 4.15.0-176-generic: OK
Linux perf_event_open syscall available: OK
Sampling trigger event available: OK
Intel(c) Last Branch Record support: Available
CPU Profiling Environment (process-tree): OK
CPU Profiling Environment (system-wide): Fail

hwilper · November 7, 2022, 9:05pm

That is a weird error message.

Can you SSH onto the system and run the CLI directly on the server?

What options are you trying to run?

lengran · November 7, 2022, 11:38pm

Yes, I can SSH to the server. The “nsys status -e” output above was produced on the server over an SSH connection.
For now I’m just a beginner trying to learn profiling techniques, and I don’t really have a specific program in mind to optimize.

hwilper · November 8, 2022, 2:46pm

Okay, if you are already on the server, you can just run the CLI there; there is no need to connect remotely from the GUI on the host. You will need to pick an application to profile. If you are just experimenting, I have used “top” as the application, although it doesn’t show much interesting content.

Try:

nsys profile top

That is enough to launch top and profile it with the default options until you end the application.
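A slightly fuller command line can be sketched as follows. This is only an illustration, not something from the reply above: `run_profile` is a hypothetical wrapper, and the report name is a placeholder. The `--output`, `--force-overwrite`, and `--stats` options belong to `nsys profile`, but check `nsys profile --help` for your version.

```shell
# Hypothetical wrapper around "nsys profile".
# $1 is the report name; the remaining arguments are the command to profile.
run_profile() {
    # Fail early with a clear message if the CLI is not installed.
    if ! command -v nsys >/dev/null 2>&1; then
        echo "nsys not found on PATH" >&2
        return 127
    fi
    report="$1"
    shift
    # --output names the report file, --force-overwrite replaces an existing
    # report with the same name, --stats prints summary statistics afterwards.
    nsys profile --output="$report" --force-overwrite=true --stats=true "$@"
}
```

For example, `run_profile top_report top -b -n 5` would profile top in batch mode for five refreshes, so the application exits on its own instead of waiting for you to quit it.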

See User Guide :: Nsight Systems Documentation (that’s a direct link to the section, even if it doesn’t look like it) for some example command lines and what they do.

At User Guide :: Nsight Systems Documentation (again a direct link) you will see a list of resources for learning more about Nsys. I especially recommend the first resource, the DLI self-paced learning course.

lengran · November 9, 2022, 8:51am

Thanks for your help. I will read it.
Also, I just found that the Nsight Systems GUI can somehow connect to the server now. Does it require an Internet connection to set up a profiling environment? I ask because I have done nothing in the past day except set up a tunnel so that it can access the Internet now. I might have missed this part in the installation manual.

lengran · November 9, 2022, 9:45am

Also, I have a question about the “CPU Profiling Environment (system-wide): Fail” line returned by “nsys status -e”. What does it mean? Would it affect any functionality while profiling a program? If so, how can I fix it?

hwilper · November 9, 2022, 9:08pm

It should not require an internet connection to connect to the server. All I can think of is that connecting to the internet opened one of the ports that it needs for communication from host to target and back. See the requirements section in the documentation for more info.
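One way to test the port theory is to probe the target from the host. The following is a minimal sketch, assuming bash and the coreutils `timeout` command are available on the host. `port_open` and `check_ports` are hypothetical helpers, `my-server` is a placeholder host name, and ports 22 (SSH) and 45555 are an assumption about what the requirements section lists, so confirm them for your version.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port opens
# within 3 seconds. Relies on bash's /dev/tcp pseudo-device.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Hypothetical helper: reports each given port as open or unreachable.
check_ports() {
    host="$1"
    shift
    for port in "$@"; do
        if port_open "$host" "$port"; then
            echo "$host:$port open"
        else
            echo "$host:$port unreachable"
        fi
    done
}
```

Running `check_ports my-server 22 45555` from the host prints one line per port. An "unreachable" result only says that the connection could not be opened; it does not distinguish a firewall rule from a service that is simply not listening.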

Your “nsys status -e” output shows that the Linux kernel paranoid level for your system is 2. We use the Linux perf subsystem to collect information about the system, and at paranoid level 2 we can only get information about things happening in your process tree. If you need information about everything on the system, you need to set the paranoid level to a different value.

But you probably do not need that, as you are most likely profiling your application and not the computer in general. If you do need it, the user manual has information on how to change the level. There is also a chart in the user manual, in the “CPU profiling using the Linux Perf Subsystem” section, that tells you what information is available at each paranoid level.
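As a rough illustration of the point about paranoid levels, the sketch below maps a level to the CPU profiling scope it is assumed to allow. The thresholds are an assumption based on the Linux kernel's perf_event_paranoid semantics (CPU-wide events need a level below 1) and are not taken from this thread; the chart in the user manual is the authoritative source. `describe_paranoid_level` is a hypothetical helper.

```shell
# Hypothetical helper: prints the CPU profiling scope assumed to be
# available to an unprivileged user at the given paranoid level.
describe_paranoid_level() {
    level="$1"
    if [ "$level" -le 0 ]; then
        echo "system-wide"
    elif [ "$level" -le 2 ]; then
        echo "process-tree only"
    else
        # Levels above 2 exist on some distribution kernels and are
        # assumed here to block unprivileged perf access entirely.
        echo "none"
    fi
}

# Report the current level when the kernel exposes it.
if [ -r /proc/sys/kernel/perf_event_paranoid ]; then
    current=$(cat /proc/sys/kernel/perf_event_paranoid)
    echo "level $current: $(describe_paranoid_level "$current")"
fi

# To change the level until the next reboot (requires root):
#   sudo sysctl -w kernel.perf_event_paranoid=0
```

After changing the level, rerunning `nsys status -e` shows whether the system-wide check now passes.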

lengran · November 9, 2022, 11:45pm

That’s weird. The firewall, as I configured it, should allow connections on any port between the server and my IP. Maybe it has nothing to do with the network. But anyway, it’s working now. Huge thanks for your help.
