I have access to a DGX-A100 system at my university, and I’m learning CUDA inside a Docker container running on it.
My host machine runs macOS. So far I’ve managed to remotely profile a CUDA program with Nsight Systems.
Next, I tried to profile it with Nsight Compute by selecting a kernel in the Nsight Systems timeline and choosing Analyze with Nsight Compute.
Nsight Compute then fails to profile. Here’s what the Progress log says:
==PROF== Attempting to connect to ncu-ui at 127.0.0.1:50152…
==PROF== Attempting to connect to ncu-ui at 172.20.10.2:50152…
==PROF== Attempting to connect to ncu-ui at 10.211.55.2:50152…
==PROF== Attempting to connect to ncu-ui at 10.37.129.2:50152…
None of these attempts succeed. In the end I get a window saying that the remote profile failed.
Here’s a screenshot of the Connection dialog in which I choose Profile:
The ncu CLI on the target is not able to connect back to the ncu-ui UI process on your host system on the specified IP addresses and port ranges. You should check that your host system is reachable from the docker instance using these connection parameters.
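One way to run that reachability check is a small TCP probe executed inside the container (or on the target) while ncu-ui is waiting for the connection. The sketch below is only an illustration: the function names `can_connect` and `probe_all` are made up for this example, and the addresses and port must be taken from your own Progress log, since the port typically changes between sessions.

```python
import socket
from typing import Dict, List


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def probe_all(hosts: List[str], port: int, timeout: float = 3.0) -> Dict[str, bool]:
    """Probe every candidate host address and report which ones accept a connection."""
    return {host: can_connect(host, port, timeout) for host in hosts}


# Example call (values copied from the log in the question, adjust to yours):
# probe_all(["127.0.0.1", "172.20.10.2", "10.211.55.2", "10.37.129.2"], 50152)
```

If every address comes back `False`, the problem is in the network path (firewall, NAT, container networking) rather than in Nsight Compute itself.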
Note that you can also use the ncu CLI directly on the target to profile your application and create a report file that can then be opened later in your host UI, or in the CLI itself. You can also use the UI’s Interactive Profile activity to interactively connect and step through your target application, selecting kernels to profile manually.
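As a rough sketch of the first alternative, the helper below assembles and runs an ncu command line on the target. It uses only the `-o` (export report) and `--kernel-name` options of the ncu CLI; the helper names, the application path `./my_app` and the kernel name `myKernel` are placeholders invented for this example.

```python
import shlex
import subprocess
from typing import List, Optional


def build_ncu_command(app: List[str], report: str,
                      kernel: Optional[str] = None) -> List[str]:
    """Build an ncu command that writes a report file for the given application."""
    cmd = ["ncu", "-o", report]  # ncu adds the .ncu-rep extension itself
    if kernel:
        # Restrict profiling to kernels with this name.
        cmd += ["--kernel-name", kernel]
    return cmd + app


def profile(app: List[str], report: str, kernel: Optional[str] = None) -> int:
    """Run ncu on the target and return its exit code."""
    cmd = build_ncu_command(app, report, kernel)
    print("running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode


# Example (hypothetical application and kernel names):
# profile(["./my_app", "1024"], "report", kernel="myKernel")
```

The resulting report file can then be copied to the host and opened in the Nsight Compute UI, which avoids the connection back from the target entirely.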
Please also let us know the exact Nsight Compute version you are using.
Note that you have to use the same version on the target and host.
You can (and should) use the latest version, even when profiling applications with older CUDA toolkit or driver versions. I recommend using version 2021.3, which has many improvements for remote profiling to better support SSH configurations.
Hello, I am experiencing the same problem.
The macOS host uses Nsight Compute 2022.3.0 and the target Linux server (headless Ubuntu Server 20.04) uses CUDA v11.8. I’ve checked that ufw is not enabled, and I’m not using Docker.
EDIT: The Interactive Profile activity works, but the (non-interactive) Profile activity fails.
Interactive and non-interactive profile activities work slightly differently, as I described in an earlier comment. For the (non-interactive) Profile activity, a connection from the target system to the local MacOS host must be established. Reasons why this may not work are:
The local MacOS host has a firewall enabled.
The local MacOS machine isn’t reachable from the remote machine (SSH alias to non-resolving DNS name, NAT issue, etc.)
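To tell these two causes apart from an Nsight Compute problem, you can open a throwaway TCP listener on the macOS host and try to connect to it from the remote machine with any TCP client (for example `nc -vz <host-ip> <port>`). The following is a minimal sketch; `open_listener` and `wait_for_peer` are names made up for this example, and the port is whatever free port you choose.

```python
import socket
from typing import Optional, Tuple


def open_listener(port: int, host: str = "0.0.0.0") -> socket.socket:
    """Open a TCP listener on the given port (0.0.0.0 = all IPv4 interfaces)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def wait_for_peer(srv: socket.socket, timeout: float = 30.0) -> Optional[Tuple[str, int]]:
    """Wait for one incoming connection; return the peer address or None on timeout."""
    srv.settimeout(timeout)
    try:
        conn, peer = srv.accept()
    except socket.timeout:
        return None
    conn.close()
    return peer


# Example, run on the macOS host (the port number is an arbitrary choice):
# srv = open_listener(50152)
# print(wait_for_peer(srv) or "nothing connected: check firewall / NAT")
```

If nothing connects within the timeout, the remote machine cannot reach the host on that address and port, which points to the firewall or to the routing/NAT setup.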
In any case, we are working on improvements to how this activity connects which should mitigate issues in almost all cases, as long as the initial connection to the target system can be established. This is scheduled to be part of the next feature release.