Nsight Systems can't recognize the conda environment when profiling the application

tianshu@node31-a100:~/fusion$ /usr/local/cuda-11.4/nsight-systems-2022.5/bin/nsys status -e
Timestamp counter supported: Yes

CPU Profiling Environment Check
Root privilege: disabled
Linux Kernel Paranoid Level = 1
Linux Distribution = Ubuntu
Linux Kernel Version = 5.4.0-107-generic: OK
Linux perf_event_open syscall available: OK
Sampling trigger event available: OK
Intel(c) Last Branch Record support: Not Available
CPU Profiling Environment (process-tree): OK
CPU Profiling Environment (system-wide): Fail
tianshu@node31-a100:~/fusion$ /usr/local/cuda-11.4/nsight-systems-2022.5/bin/nsys --version
NVIDIA Nsight Systems version 2022.5.1.82-32078057v0
a100:~/fusion$ nvidia-smi
Tue Feb 28 13:44:58 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  On   | 00000000:07:00.0 Off |                    0 |
| N/A   28C    P0    18W / 400W |   5475MiB / 39538MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM...  On   | 00000000:0A:00.0 Off |                    0 |
| N/A   26C    P0    52W / 400W |      3MiB / 39538MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA A100-SXM...  On   | 00000000:47:00.0 Off |                    0 |
| N/A   26C    P0    54W / 400W |      3MiB / 39538MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA A100-SXM...  On   | 00000000:4D:00.0 Off |                    0 |
| N/A   28C    P0    54W / 400W |      3MiB / 39538MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM...  On   | 00000000:87:00.0 Off |                    0 |
| N/A   28C    P0    54W / 400W |      3MiB / 39538MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA A100-SXM...  On   | 00000000:8D:00.0 Off |                    0 |
| N/A   27C    P0    53W / 400W |      3MiB / 39538MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA A100-SXM...  On   | 00000000:C7:00.0 Off |                    0 |
| N/A   26C    P0    56W / 400W |      3MiB / 39538MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   7  NVIDIA A100-SXM...  On   | 00000000:CA:00.0 Off |                    0 |
| N/A   28C    P0    59W / 400W |   2058MiB / 39538MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

When I run the application on its own, it works well.

(base) tianshu@node31-a100:~/fusion/newGCNFusion_copy_2$ conda activate work2
(work2) tianshu@node31-a100:~/fusion/newGCNFusion_copy_2$ python gcn_test.py
cuda:0
TC_Blocks:      23772
Exp_Edges:      6085632
Prep. (ms):     122.526

But a “ModuleNotFoundError” occurred when I profiled it with nsys. What is the problem here?

(work2) tianshu@node31-a100:~/fusion/newGCNFusion_copy_2$ sudo /usr/local/cuda-11.4/nsight-systems-2022.5/bin/nsys profile python gcn_test.py
  File "gcn_test.py", line 57
SyntaxError: Non-ASCII character '\xe7' in file gcn_test.py on line 57, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
Generating '/tmp/nsys-report-efba.qdstrm'
[1/1] [========================100%] report2.nsys-rep
Generated:
    /home/tianshu/fusion/newGCNFusion_copy_2/report2.nsys-rep

When I replace “python” with “python3”, the above error disappears and a new error occurs, and that is the problem that puzzles me.

(work2) tianshu@node31-a100:~/fusion/newGCNFusion_copy_2$ sudo /usr/local/cuda-11.4/nsight-systems-2022.5/bin/nsys profile python3 gcn_test.py
Traceback (most recent call last):
  File "gcn_test.py", line 1, in <module>
    import dgl
ModuleNotFoundError: No module named 'dgl'
Generating '/tmp/nsys-report-c23c.qdstrm'
[1/1] [========================100%] report3.nsys-rep
Generated:
    /home/tianshu/zjlab/fusion/newGCNFusion_copy_2/report3.nsys-rep

@tcourtney please take a look at this.

@lylyly6666 I suspect that the issue is related to “sudo”. sudo runs the command as the root user, typically with a reset environment (including PATH), so it may pick up a different Python installation and will not have your conda environment activated.

To see if this is an issue, you could try a few things:

  1. In your normal shell with the conda environment active, run “which python”.
  2. Then run “sudo which python” and see what it reports.
  3. Then run “sudo bash”, activate your conda environment in that root shell, and run “which python”.

I believe that 3) should give you a root shell with the python environment that you expect. If so, then I hope you could run nsys with your application and get the same result as when you run your application directly.

If you try these steps and you still can’t get nsys to work, please send me the output of the commands on each step and I can suggest what to do next.
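
As a complement to the “which python” checks above, here is a minimal sketch of a diagnostic script that reports which interpreter is actually running and whether a given module can be imported from it. The file name `check_env.py` and the function name `describe_environment` are hypothetical, not part of nsys or conda; `CONDA_DEFAULT_ENV` is the variable conda normally sets on activation, and it may be absent under sudo. The script assumes Python 3.

```python
# check_env.py (hypothetical name): report the interpreter and module visibility.
import importlib.util
import os
import sys


def describe_environment(module_name="dgl"):
    """Return basic facts about the interpreter running this script."""
    return {
        # Absolute path of the interpreter that is executing this file.
        "executable": sys.executable,
        # Interpreter version, e.g. "3.8.10".
        "version": sys.version.split()[0],
        # Name of the active conda environment, or None if not set.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # True if the top-level module can be located on sys.path.
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it once as `python3 check_env.py` and once as `sudo python3 check_env.py` should show whether the two invocations resolve to different interpreters, which would be consistent with the “No module named 'dgl'” error above.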

After running the commands in 3), I can finally run the application with Nsight Systems. Thank you most sincerely.

[7/8] Executing 'gpumemtimesum' stats report

 Time (%)  Total Time (ns)  Count  Avg (ns)   Med (ns)  Min (ns)  Max (ns)   StdDev (ns)      Operation     
 --------  ---------------  -----  ---------  --------  --------  ---------  -----------  ------------------
     88.2        9,512,799  3,809    2,497.5   2,463.0     2,336      4,033        105.4  [CUDA memcpy DtoH]
     11.4        1,228,877     11  111,716.1   2,497.0     2,208  1,186,536    356,487.1  [CUDA memcpy HtoD]
      0.3           29,027      8    3,628.4   3,744.5     2,304      4,800      1,210.3  [CUDA memset]     
      0.2           19,556      4    4,889.0   4,833.0     4,833      5,057        112.0  [CUDA memcpy DtoD]

[8/8] Executing 'gpumemsizesum' stats report

 Total (MB)  Count  Avg (MB)  Med (MB)  Min (MB)  Max (MB)  StdDev (MB)      Operation     
 ----------  -----  --------  --------  --------  --------  -----------  ------------------
     15.750     11     1.432     0.003     0.000    15.522        4.673  [CUDA memcpy HtoD]
      0.212      4     0.053     0.053     0.053     0.053        0.000  [CUDA memcpy DtoD]
      0.029      8     0.004     0.005     0.000     0.005        0.002  [CUDA memset]     
      0.017  3,809     0.000     0.000     0.000     0.000        0.000  [CUDA memcpy DtoH]

Generated:
    /home/tianshu/fusion/report3.nsys-rep
    /home/tianshu/fusion/report3.sqlite

Thanks again!
