Linux: Cannot Start Profile, Cannot Start Daemon

I have an Ubuntu 20.04 (Focal) computer with 4x RTX A5000 GPUs and an Intel Core i9 CPU. I have CUDA 11.4 with driver 470.86. I recently installed Nsight Systems 2021.5.1 and am struggling to get it started.

When I run a profile from the command line, there is a pause and then:

$ nsys profile python3 streamtest.py
/build/agent/work/20a3cfcd1c25021d/QuadD/Common/AgentAPI/Src/SessionImpl.cpp(18): rpc Start(.Agent.EmptyMessage) returns (.Agent.EmptyMessage);
 is canceled because the timeout period is expired

When I start nsys-ui it launches the GUI fine. I am asked to select a target for profiling. Under Localhost I see my hostname but when I try to connect I simply get “Failed to launch daemon”. If I copy to clipboard I can see:

NVIDIA Nsight Systems
2021.5.1.118-f89f9cd Linux

- host:

[Error] Failed to launch daemon
Failed to launch daemon executable on target.<br>Please kill all running instances of nsys manually or reboot the target.

If instead I set up an SSH connection I get a similar failure, but I am able to get the following error message:

DaemonStartError (1405) {
    OriginalExceptionClass: N5boost16exception_detail10clone_implIN13QuadDAnalysis16DaemonStartErrorEEE
    OriginalFile: /build/agent/work/20a3cfcd1c25021d/QuadD/Host/Analysis/SshDevice.cpp
    OriginalLine: 1493
    OriginalFunction: virtual std::string QuadDAnalysis::SshDevice::StartDaemon(const string&)
    ErrorText: Daemon start failed.
    ExitCode: 1
}

I should note that I’m trying to profile because I seem to have a problem where all CUDA kernels take 30+ seconds before they do anything.

Any help is appreciated. Thank you!
Luke

It looks to me like you may have a zombie Nsys process on the system.

@liuyis, do you have another suggestion?

hwilper’s suspicion makes sense. Could you try killing all “nsys” processes and/or rebooting the system?
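
For reference, here is a minimal sketch of how leftover processes could be listed and stopped from a shell. It assumes the relevant process names start with "nsys" (for example nsys or nsys-ui), which may not cover every helper process Nsight Systems spawns. The function names list_nsys and stop_nsys are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: find and stop leftover Nsight Systems processes.
# Assumption: the process names of interest start with "nsys".

list_nsys() {
    # -l prints each matching PID with its process name. pgrep exits
    # non-zero when nothing matches, so "|| true" keeps the script going.
    pgrep -l '^nsys' || true
}

stop_nsys() {
    # Send SIGTERM first, wait briefly, then SIGKILL whatever is left.
    pkill '^nsys' || true
    sleep 2
    pkill -KILL '^nsys' || true
}

list_nsys
# Uncomment to actually terminate the listed processes:
# stop_nsys
```

Processes owned by another user need sudo to be signalled. Note that a process shown as defunct in ps has already exited and cannot be killed; it only goes away once its parent reaps it or exits, which is one reason a reboot can be the simpler fix.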

If the issue still happens after rebooting, please collect logs to help us investigate:

  1. Save the following content to a file named nvlog.config:

     + 75iewf 75IWEF global

     $ /tmp/nsight-sys.log

     ForceFlush

     Format $sevc$time|${name:0}|${tid:5}|${file:0}:${line:0}[${sfunc:0}]: $text

  2. Run the following command: NVLOG_CONFIG_FILE=<path to nvlog.config> nsys profile python3 streamtest.py
  3. A log file should appear at /tmp/nsight-sys.log. Please share it with us.
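
The steps above can be sketched as a single script. The config location (the cfg variable, defaulting to a file under /tmp) is an assumption chosen for illustration; any path works as long as NVLOG_CONFIG_FILE points at it. The nsys invocation is skipped when nsys is not on PATH.

```shell
#!/usr/bin/env bash
# Sketch of the log-collection steps. The config path is an assumption.
cfg="${TMPDIR:-/tmp}/nvlog.config"

# The quoted 'EOF' stops the shell from expanding the $-prefixed fields,
# so the file receives the text exactly as written here.
cat > "$cfg" <<'EOF'
+ 75iewf 75IWEF global

$ /tmp/nsight-sys.log

ForceFlush

Format $sevc$time|${name:0}|${tid:5}|${file:0}:${line:0}[${sfunc:0}]: $text
EOF

if command -v nsys >/dev/null 2>&1; then
    # Same command as in the steps, with the config path filled in.
    NVLOG_CONFIG_FILE="$cfg" nsys profile python3 streamtest.py
    # Check whether the log file was produced.
    ls -l /tmp/nsight-sys.log
else
    echo "nsys not found on PATH; wrote $cfg only" >&2
fi
```

Setting the variable on the same line as the command limits it to that one nsys run, so later commands in the shell are unaffected.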

Thanks!

Thank you. It is possible it was the zombie nsys process. After rebooting the system, the problem has been resolved.

Thanks for the help.
Luke
