Hi, I’m trying to use Nsight Systems in Docker, but I cannot get the CPU statistics.
I followed the guide to enable CPU sampling in Docker by setting the paranoid level to 2 and passing the custom seccomp configuration file, but it still doesn’t work.
From the Nsight environment check I see that the “Linux Kernel Paranoid Level” is set to -1.
My environment is:
Host:
Ubuntu 22.04 LTS
Docker 20.10.16
NVIDIA driver 510.73.05
Container:
Ubuntu 20.04 LTS
CUDA 11.4 (installed with .run file)
Nsight Systems 2021.2.4.12-a25c8fd
Nsight environment query on container:
nsys status -e
Timestamp counter supported: Yes
Sampling Environment Check
Linux Kernel Paranoid Level = -1: OK
Linux Distribution = Ubuntu
Linux Kernel Version = 5.15.0-35-generic: OK
Linux perf_event_open syscall available: OK
Sampling trigger event available: OK
Intel(c) Last Branch Record support: Available
Sampling Environment: OK
nvidia-smi on host:
nvidia-smi
Tue Jun 7 14:01:20 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 4000     Off  | 00000000:01:00.0  On |                  N/A |
| 30%   43C    P8    17W / 125W |    546MiB /  8192MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2566      G   /usr/lib/xorg/Xorg                146MiB |
|    0   N/A  N/A      2701      G   ...ome-remote-desktop-daemon        2MiB |
|    0   N/A  N/A      2737      G   /usr/bin/gnome-shell              188MiB |
|    0   N/A  N/A      5236      G   ...6/usr/lib/firefox/firefox      157MiB |
|    0   N/A  N/A     18587      G   ...AAAAAAAAA= --shared-files       18MiB |
|    0   N/A  N/A     24574      G   ...ost-linux-x64/nsys-ui.bin       27MiB |
+-----------------------------------------------------------------------------+
As you can see, the only information I get is a few poll and ioctl calls.
To launch nsys I just use:
nsys profile <executable>
I would like to get an output similar to this: report1.nsys-rep (1.8 MB). It comes from profiling a simple vector addition application on a native Windows machine.
Nope, still not getting anything.
Also, I just tried on the local Linux machine (the host) with CUDA 11.7, and even there I don’t get any CPU statistics.
nsys status -e on host:
Timestamp counter supported: Yes
Sampling Environment Check
Root privilege: disabled
Linux Kernel Paranoid Level = 2: OK
Linux Distribution = Ubuntu
Linux Kernel Version = 5.15.0-37-generic: OK
Linux perf_event_open syscall available: OK
Sampling trigger event available: OK
Intel(c) Last Branch Record support: Available
Sampling Environment: OK
Can you try setting the paranoid level to a lower value, e.g., 1 or -1?
Also, I forgot to mention previously that system-wide sampling requires root privileges. Can you try adding sudo?
You can also check the Diagnostics Summary section for warnings and errors, which can give clues as to why the samples are not being collected.
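A minimal sketch of the paranoid-level check suggested above, assuming a POSIX shell. The describe_paranoid helper is hypothetical (it is not part of nsys), and its wording paraphrases the Linux kernel's sysctl documentation for perf_event_paranoid:

```shell
#!/bin/sh
# Hypothetical helper: map a perf_event_paranoid value to a short description.
# Each level adds a restriction for unprivileged users on top of the lower ones.
describe_paranoid() {
    case "$1" in
        -1) echo "-1: (almost) all perf events allowed for all users" ;;
        0)  echo "0: raw and ftrace tracepoint access disallowed" ;;
        1)  echo "1: CPU event access also disallowed" ;;
        2)  echo "2: kernel profiling also disallowed" ;;
        *)  echo "$1: not defined in the upstream kernel documentation" ;;
    esac
}

# Print the level seen by this machine (or by this container).
if [ -r /proc/sys/kernel/perf_event_paranoid ]; then
    describe_paranoid "$(cat /proc/sys/kernel/perf_event_paranoid)"
fi

# To change the level until the next reboot (run on the host, requires root):
#   sudo sh -c 'echo 1 > /proc/sys/kernel/perf_event_paranoid'
```

Running it both on the host and inside the container shows whether the two report the same level.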
I believe the nsys crash you experienced has been fixed in a newer build of nsys. Can you upgrade to the latest version of nsys?
Also, looking at your screenshot, it looks like CPU sampling data was collected. When you ask for ‘CPU Statistics’, are you asking for the CPU sampling summary/histogram results? If so, make sure you are selecting the ‘Bottom-Up View’ in the drop down box below the timeline. See the attached screenshot.
Looking closer at this conversation, I’m sorry: I didn’t realize report1.nsys-rep was collected on a Windows system. It looks correct.
A paranoid level of 2 should work to profile your application and any processes it launches. You do not need to set the paranoid level any lower.
I also don’t think you need to use system-wide sampling unless you are trying to understand what else is running on the system and using system resources while your application runs.
Can you try launching the Docker container with the --privileged=true switch, installing the latest version of nsys, and running the nsys status --environment command from within the container? Please post those results here. Then, try your collection again.
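The steps above can be sketched roughly as follows. The image name and the check_nsys_env function are hypothetical placeholders, and --gpus all is an assumption that the NVIDIA Container Toolkit is installed on the host; only --privileged=true and nsys status --environment come from this thread:

```shell
#!/bin/sh
# Hypothetical image name: substitute an image that has CUDA and the latest nsys installed.
IMAGE="${IMAGE:-my-cuda-nsys-image:latest}"

# Run the nsys sampling environment check inside a privileged container.
# --gpus all is an assumption (NVIDIA Container Toolkit); drop it if you use another GPU setup.
check_nsys_env() {
    docker run --rm --gpus all --privileged=true "$IMAGE" \
        nsys status --environment
}

# Usage (not executed here): check_nsys_env
```

If the check then reports “Root privilege: enabled” and “Sampling Environment: OK”, the same docker run options can be reused for the actual nsys profile collection.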
Hi, sorry for the late answer.
I just ran the container with --privileged=true and installed the latest version of nsys.
The sampling environment check looks correct now:
nsys status --environment
Timestamp counter supported: Yes
Sampling Environment Check
Root privilege: enabled
Linux Kernel Paranoid Level = 2: OK
Linux Distribution = Ubuntu
Linux Kernel Version = 5.15.0-37-generic: OK
Linux perf_event_open syscall available: OK
Sampling trigger event available: OK
Intel(c) Last Branch Record support: Available
Sampling Environment: OK
Your report1.nsys-rep file did include CPU Instruction Pointer (IP) samples. Check out the attached screenshot of that collection’s diagnostics. The red box is a warning indicating that kernel IP samples (i.e., IP samples of OS execution) can’t be collected. This is a warning, not an error. The green box shows that 542 CPU IP samples were collected.
But, maybe you are asking about something else. What do you mean when you say no “CPU op” info was collected?
The thread state information comes from the OS. Linux and Windows provide different information and nsys utilizes what is available. So, these differences are expected.