How to profile all CUDA activity on a system


I have a complex setup with multiple processes using the GPU on a single machine. How can I get a CUDA trace for all of them using Nsight Systems?

I checked Nsight Systems GUI, but I see no options for this. If I just run it with one of my processes, I do see the other ones in the timeline, but no CUDA trace for them.

I also tried the Nsight Systems CLI with nsys start and nsys stop, but the output file does not contain any CUDA trace.

Any suggestions?

The easy way to get multiple processes under one Nsys run is to either set process-tree (if they are all launched from the same base process) or profile a script that launches all of them.

What was the CLI command that you used?

Thanks for your reply.

1. The CLI commands that I used are:
nsys start
nsys stop

The processes that are using CUDA are active between the two commands above. However, the output nsys file doesn’t contain any CUDA trace. Am I missing something, or does this not work on Windows?

2. Creating a script that launches all the processes is an option, but it is very complicated to do in our case, which is why I was looking for alternatives. Assuming we do it, do you know whether process-tree is available only from the CLI or also from the GUI? I searched for it in the GUI but couldn’t find it.


Those commands won’t get you what you want. See the CLI command switches section of the User Guide in the Nsight Systems Documentation.

But in general, you should either use the “fire and forget” command with “nsys profile” or you will want to use the interactive commands start, launch, and stop. Unless you use all three, either the application or the profiler does not start.
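As a rough illustration of the interactive mode described above, here is a minimal Python sketch that only builds and prints the three commands of one session; it does not run them. The helper name, the application name my_cuda_app.exe, and the report name are placeholders, and the exact switches accepted by start and launch should be checked against the User Guide for your Nsight Systems version.

```python
def interactive_session_commands(app_cmd, output="report"):
    """Return the three nsys commands that make up one interactive session.

    app_cmd is the command line of the application to launch (placeholder).
    The trace selection goes on `launch`, the report name on `start`;
    verify both against the User Guide for your nsys version.
    """
    return [
        ["nsys", "start", f"--output={output}"],
        ["nsys", "launch", "--trace=cuda", *app_cmd],
        ["nsys", "stop"],
    ]


if __name__ == "__main__":
    # Print the commands only; issuing them (and from which terminal)
    # depends on your workflow.
    for cmd in interactive_session_commands(["my_cuda_app.exe"]):
        print(" ".join(cmd))
```

Without the launch step, nothing is started under the profiler, which matches the empty CUDA trace seen with start and stop alone.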

If you are just trying to control which part of the application is traced, I would recommend running your session with the “nsys profile” command and its delay or duration option.
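A minimal sketch of that suggestion, assuming the --delay and --duration switches of nsys profile (both in seconds). The helper name, the report name, and the launch_all.py target are hypothetical; the helper only assembles the command line so that it can be inspected before running it.

```python
def build_profile_command(app_cmd, delay_s=None, duration_s=None, output="report"):
    """Assemble an `nsys profile` command line with optional delay/duration.

    delay_s:    seconds to wait after launch before collection starts
    duration_s: seconds of collection before it stops
    Names and defaults here are illustrative, not part of nsys itself.
    """
    cmd = ["nsys", "profile", "--trace=cuda", f"--output={output}"]
    if delay_s is not None:
        cmd.append(f"--delay={delay_s}")
    if duration_s is not None:
        cmd.append(f"--duration={duration_s}")
    return cmd + list(app_cmd)


if __name__ == "__main__":
    # Hypothetical target: a script that starts all CUDA processes.
    command = build_profile_command(
        ["python", "launch_all.py"], delay_s=10, duration_s=30
    )
    print(" ".join(command))
```

To actually start the session, the resulting list can be passed to subprocess.run, or the printed line can be pasted into a terminal.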

Thanks for your reply. I did some more tests and it’s clear now that all 3 interactive commands are needed (start, launch, stop). My initial understanding of the documentation was wrong.

So, I guess our only option for profiling all the processes is to create a script that launches all of them.

Yes, nsys launches one process; putting all of them in a script makes them one process tree.
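A minimal sketch of such a launcher, written in Python so that it also runs on Windows. The two worker commands are stand-ins that only print a line; they would be replaced with the real command lines of the CUDA processes. The file name launch_all.py and the function name are hypothetical.

```python
import subprocess
import sys

# Placeholder commands: replace these with the real command lines
# of the processes that use CUDA.
WORKER_COMMANDS = [
    [sys.executable, "-c", "print('worker A running')"],
    [sys.executable, "-c", "print('worker B running')"],
]


def launch_all(commands):
    """Start every command as a child of this script and wait for all of them.

    Because each process is a child of the launcher, profiling the launcher
    puts all of them in one process tree. Returns the exit codes in order.
    """
    children = [subprocess.Popen(cmd) for cmd in commands]
    return [child.wait() for child in children]


if __name__ == "__main__":
    exit_codes = launch_all(WORKER_COMMANDS)
    print("exit codes:", exit_codes)
```

Saved as launch_all.py, it could then be profiled in a single run with something like: nsys profile --trace=cuda python launch_all.py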
