How to profile all CUDA activity on a system

ancah · October 21, 2022, 3:06pm

Hello!

I have a complex setup with multiple processes using the GPU on a single machine. How can I get a CUDA trace for all of them using Nsight Systems?

I checked the Nsight Systems GUI, but I see no option for this. If I just run it with one of my processes, I do see the other ones in the timeline, but there is no CUDA trace for them.

I also tried the Nsight Systems CLI with nsys start and nsys stop, but the output file does not contain any CUDA trace.

Any suggestions?
Thanks.

hwilper · October 21, 2022, 3:23pm

The easy way to get multiple processes under one Nsys run is to either set process-tree (if they are all launched from the same base process) or profile a script that launches all of them.

What was the CLI command that you used?

ancah · October 28, 2022, 7:56am

Thanks for your reply.

1. The CLI commands that I used are:
nsys start
nsys stop

The processes that use CUDA are active between the two commands above. However, the output nsys file doesn’t contain any CUDA trace. Am I missing something, or does this not work on Windows?

2. Creating a script that launches all the processes is an option, but it is very complicated to do in our case, which is why I was looking for alternatives. Assuming we do it, do you know whether process-tree is available only from the CLI or also from the GUI? I searched for it in the GUI but couldn’t find it.

Thanks!

hwilper · October 28, 2022, 5:00pm

Those commands won’t get you what you want. See the Nsight Systems User Guide (that’s a direct link to the CLI command switches).

But in general, you should either use the “fire and forget” command, “nsys profile”, or use the interactive commands start, launch, and stop. Unless you use all three interactive commands, either the application or the profiler does not start.
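As a rough illustration of the interactive sequence described above, the following Python sketch only builds and prints the three commands in the order named in the reply. It does not run nsys. The application name my_app.exe and the choice of --trace=cuda on the launch step are assumptions made for the example, and the stop command may need to be issued from a second terminal while the application is still running.

```python
import shlex


def interactive_session_commands(app_cmd, trace="cuda"):
    """Build the three interactive nsys commands: start, launch, stop.

    app_cmd is the application command line as a list of strings.
    trace is passed to `nsys launch` as --trace=<value> (assumed: CUDA trace).
    """
    return [
        ["nsys", "start"],                               # begin a collection
        ["nsys", "launch", f"--trace={trace}", *app_cmd],  # start the app under nsys
        ["nsys", "stop"],                                # end collection, write report
    ]


if __name__ == "__main__":
    # "my_app.exe" is a hypothetical placeholder for the real application.
    for cmd in interactive_session_commands(["my_app.exe"]):
        print(shlex.join(cmd))
```

Leaving out any one of the three steps reproduces the situation from the original question: either nothing is launched under the profiler or no collection is running.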

If you are just trying to control which part of the application is traced, I would recommend using the delay option or the duration option with the “nsys profile” command to run your session.
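To make the delay and duration suggestion concrete, here is a small Python sketch that assembles an nsys profile command line with optional --delay and --duration values (both in seconds). It only builds the argument list. The helper name profile_command, the output name report, and the application my_app.exe are hypothetical, and the exact option set should be checked against the User Guide for the installed version.

```python
def profile_command(app_cmd, delay_s=None, duration_s=None,
                    output="report", trace="cuda"):
    """Build an `nsys profile` command line as a list of arguments.

    delay_s:    seconds to wait before collection starts (omitted if None)
    duration_s: seconds to collect before stopping (omitted if None)
    """
    cmd = ["nsys", "profile", f"--trace={trace}", f"--output={output}"]
    if delay_s is not None:
        cmd.append(f"--delay={delay_s}")
    if duration_s is not None:
        cmd.append(f"--duration={duration_s}")
    return cmd + list(app_cmd)


if __name__ == "__main__":
    # Skip the first 5 seconds, then collect for 30 seconds.
    # "my_app.exe" is a hypothetical placeholder.
    print(" ".join(profile_command(["my_app.exe"], delay_s=5, duration_s=30)))
```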

ancah · November 1, 2022, 3:51pm

Thanks for your reply. I did some more tests and it’s clear now that all 3 interactive commands are needed (start, launch, stop). My initial understanding of the documentation was wrong.

So, I guess our only option for profiling all the processes is to create a script that launches all of them.

hwilper · November 1, 2022, 7:30pm

Yes, nsys launches one process; putting them in a script makes them one process tree.
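A minimal sketch of such a launcher script, written in Python so that it runs on both Windows and Linux. The entries in COMMANDS are placeholders (they just start trivial Python workers) and would be replaced with the real CUDA processes. The function name launch_all is hypothetical as well.

```python
import subprocess
import sys

# Placeholder commands for illustration only; replace each entry with the
# command line of one of the real processes that should be profiled.
COMMANDS = [
    [sys.executable, "-c", "print('worker A')"],
    [sys.executable, "-c", "print('worker B')"],
]


def launch_all(commands):
    """Start every command as a child process, then wait for all of them.

    All children are started before any wait, so they run concurrently.
    Returns the exit codes in the same order as `commands`.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    codes = launch_all(COMMANDS)
    print("exit codes:", codes)
```

Saved under a name such as launch_all.py (hypothetical), the script could then be given to nsys profile as the single application to run, for example nsys profile --trace=cuda python launch_all.py. Every process it starts is a child of the launcher and so belongs to the one process tree that nsys launched.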

