Hi,
I am new to using Nsight Compute. Recently, I wanted to profile my network's performance. I run the network on a Docker server, where I have the necessary privileges, so I can capture some information from the GPU and CPU.
When I profile the Python script from the ncu Windows host, I set the parameters like this.
When I launch, I get this error,
and I don't know how to fix it.
Also, in the application executable line, I selected the python.exe from my Python environment; is that right?
I would really appreciate any help you could give.
Good day.
The problematic part is the “No section files found in search paths” message, due to which none of the selected sections, such as LaunchStats, can be found.
Section files are text-based files shipped with Nsight Compute that list the metrics to collect and their representation in the UI. During remote profiling, the UI deploys the command line and support files, including these sections, to the remote machine. It appears that most files are listed as present and up-to-date on the remote system, as indicated by the “Checking file deployment” output.
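For illustration, a section file is a protobuf text-format description along these lines. This is a hedged sketch, not a verbatim copy of the shipped LaunchStats.section; the exact fields and contents may differ between Nsight Compute versions (the metric names `launch__grid_size` and `launch__block_size` are real built-in metrics):

```
Identifier: "LaunchStats"
DisplayName: "Launch Statistics"
Description: "Summary of the kernel launch configuration."
Metrics {
  Metrics {
    Label: "Grid Size"
    Name: "launch__grid_size"
  }
  Metrics {
    Label: "Block Size"
    Name: "launch__block_size"
  }
}
```

Because these files are looked up on the target at profile time, the profiler cannot collect or display a section whose file is missing from the search paths.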
Which version of Nsight Compute are you using? If you haven’t yet, could you try the latest available, i.e. 2022.1?
In the connection settings (i.e. the part you redacted), please check to which directory the command line and sections files are deployed. Then check on the remote system if the files are truly present and accessible by the selected user. One thing I noticed about the Command Line shown in the Profile activity window is that the --section-folder option is present there, which may explain why it doesn’t work on the remote system where this path wouldn’t be valid.
The version of Nsight Compute I am using is the newest one:
![image](https://global.discourse-cdn.com/nvidia/original/3X/f/f/fff29f48c91305bf63094b75cce26828682bc483.png)
You mentioned: “I noticed about the Command Line shown in the Profile activity window that the --section-folder option is present there, which may explain why it doesn’t work on the remote system where this path wouldn’t be valid.” So I tried copying the .section files to that folder
![image](https://global.discourse-cdn.com/nvidia/original/3X/1/b/1bd0a22604b4378e8719a66bd1cb4920e5becf82.png)
and gave them permissions,
but then it reports:
The directory in question is the one specified in the connection dialog. You should confirm that you have write access there.
![image](https://global.discourse-cdn.com/nvidia/original/3X/d/8/d836c17b07c328d6b3d237d4e99a80fb66bc3c7a.png)
It does look correct, yes. Maybe it’s worth trying with the default target deployment directory, to see if that makes a difference in your case.
I tried, but it didn’t work either.
It also says the folder isn’t writable, so I gave it permissions like this:
![image](https://global.discourse-cdn.com/nvidia/original/3X/7/3/73218b8581e261bb98e924421b33ecde26d461dc.png)
I need a tool to profile my network’s performance on the GPU, e.g. the GPU L2 cache miss rate.
Do you have any other suggestions?
It is fine if it can’t deploy files to your local documents directory, as it would fall back to the default sections directory. That’s why it is only a warning. For some reason, this fallback fails. It would be useful for you to check whether the sections were properly deployed to /tmp/var/sections.
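For example, from a shell on the remote system (inside the container, as the same user you gave Nsight Compute) you could check this with something like the sketch below. The path comes from this thread; the exact checks are just a suggestion:

```shell
# Verify the deployed sections directory exists, is accessible to the
# profiling user, and actually contains the .section files.
SECTIONS_DIR=/tmp/var/sections

ls -la "$SECTIONS_DIR"
find "$SECTIONS_DIR" -maxdepth 1 -name '*.section' | wc -l

# A quick read/write permission check for the current user:
if [ -r "$SECTIONS_DIR" ] && [ -w "$SECTIONS_DIR" ]; then
    echo "sections dir is readable and writable"
else
    echo "permission problem on $SECTIONS_DIR"
fi
```

If the `find` count is zero or the permission check fails, deployment did not complete for this user, which would match the warning you are seeing.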
When I launch, it creates /target/linux-desktop-glibc_2_11_3-x64, which contains these files:
It doesn’t contain any *.section files or a sections folder.
I also have another question: since I am using a Docker server, I can’t see the system screen. In this case, could I download the Linux Desktop package
and use the command line to profile a Python script that uses PyTorch?
I tried to populate it in advance, but it failed again:
It still appears there are permission issues within the docker container when accessing this directory. It’s not clear from your previous descriptions whether you mount the target deployment directory into the container, or whether it’s a separate directory inside the container. Can you clarify?
My recommendations for you would be the following:
- First option is to simply mount your host’s Nsight Compute installation into the container file system and then run the ncu CLI from within the container. Have this generate a report file with -o and then analyze this report file with either the CLI within/outside the container, or map/copy it back to the host OS to open it in the UI.
- Understand and solve the directory permission issues. It’s not clear if the permission info you’ve shown earlier is from within the container or from the host. Note that by default, user permissions don’t match, as the UID in the container will differ from your host user ID. You can tell docker to map your host user to your docker user to solve this; please refer to the docker documentation on how to do this. A way to confirm that this likely is the issue could be to start the container and then ssh into it with the same user/credentials you passed to Nsight Compute during the remote connection. I would expect that this user also couldn’t touch (Linux command) any files in the deployment directory, e.g. /tmp/var/sections. Yet another option could be to ssh into the container (with Nsight Compute) as root, even though that would potentially require enabling ssh for root and may not be what you want.
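As a rough sketch of the first option: all paths, the image name, and the script name below are placeholders (none of them come from this thread), and `--gpus all` assumes the NVIDIA Container Toolkit is set up on the docker host:

```shell
# Sketch of option 1: mount the host's Nsight Compute installation
# into the container and run the ncu CLI there.
NCU_HOST_DIR=/opt/nvidia/nsight-compute/2022.1.0   # example host install path
WORKDIR=/workspace                                  # where the script lives

docker run --rm --gpus all \
    -v "$NCU_HOST_DIR":/opt/ncu:ro \
    -v "$(pwd)":"$WORKDIR" -w "$WORKDIR" \
    my-pytorch-image \
    /opt/ncu/ncu --set full -o my_report python train.py

# For option 2, mapping the host user into the container can avoid
# UID mismatches on the deployment directory:
#   docker run --user "$(id -u):$(id -g)" ...
```

The resulting my_report.ncu-rep can then be inspected with the CLI (`ncu --import my_report.ncu-rep`) inside or outside the container, or copied/mapped back to the host OS and opened in the UI, as described above.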
For further info on using Nsight Compute in containers, also see:
https://developer.nvidia.com/blog/using-nsight-compute-in-containers/