nsys version: NVIDIA Nsight Systems version 2022.2.1.31-5fe97ab
command: nsys profile --delay=5 python test.py
error: Agent launcher failed.
You are a couple of versions behind, but I don’t know whether upgrading is likely to help. I can’t tell from the above what is going on.
Can you try the following?
1. Please try profiling with -t none -s none --cpuctxsw=none to see if the workload runs successfully.
2. If 1 works, then try -t cuda -s none --cpuctxsw=none to see if the workload runs successfully.
3. If 2 works, then try just the -t cuda CLI options.
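The three steps above can be scripted so that the first failing option set is easy to spot. This is only a sketch: run_steps and the NSYS override variable are hypothetical names introduced here, and the option sets are copied from the steps above. It assumes nsys is on the PATH and that the workload command is passed as arguments.

```shell
# Try the option sets from the steps above in order, from least to most
# instrumentation, and stop at the first one that fails.
# NSYS is a hypothetical override for the profiler binary (default: nsys).
run_steps() {
  for opts in "-t none -s none --cpuctxsw=none" \
              "-t cuda -s none --cpuctxsw=none" \
              "-t cuda"; do
    echo "== nsys profile $opts $*"
    # $opts is deliberately unquoted so it splits into separate flags.
    if ! ${NSYS:-nsys} profile $opts "$@"; then
      echo "failed with options: $opts" >&2
      return 1
    fi
  done
}

# Usage: run_steps python test.py
```

Setting NSYS=echo before calling run_steps gives a dry run that only prints the commands it would execute.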
I see this issue (nsys got into this state somehow) with a recent version (output below). None of the workarounds upthread work for me. There are no runaway or stale nsys processes on the system. Rebooting isn’t an option for me (the system has been powered on for several months).
$ nsys --version
NVIDIA Nsight Systems version 2022.4.2.1-df9881f
Running nsys profile with any command fails:
$ nsys profile -t cuda --cpuctxsw=none -s none <any command>
Agent launcher failed.
$ ps -e | grep nsys
$
I found the issue for my situation. The / partition was full. nsys is clearly missing an error check for failing to create or write to a file, and it should emit a proper error message in that case. It’s likely that other users who faced this issue had a full root partition because /tmp/ had grown, and that a reboot then cleared it.
Specifically, what is the path of the file that it couldn’t write?
And is / or /tmp the problem?
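One way to tell whether / or /tmp is the filesystem that has run out of space is to check both with df. This is a minimal sketch: free_kb is a hypothetical helper name, and the thread does not say which file nsys failed to write, so the two paths checked are simply the ones mentioned above.

```shell
# Print the free space, in KiB, on the filesystem that holds the given path.
# df -P forces the POSIX one-line-per-filesystem format; column 4 is "Available".
free_kb() {
  df -Pk "$1" | awk 'NR==2 { print $4 }'
}

for dir in / /tmp; do
  echo "$dir: $(free_kb "$dir") KiB free"
done
```

If both paths report the same number, /tmp is most likely on the same filesystem as /, so filling either one fills both.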