Nsight Systems: Profiling two CUDA Python (i.e. PyTorch) processes using the same GPU simultaneously

lvcunchi · March 16, 2024, 7:18am

As the title says, when I use nsys to profile two PyTorch processes running simultaneously on one A100, I find that the CUDA graph executions of the two processes are completely interleaved, as in the following picture. Can you tell me the reason? Thanks a lot.

hwilper · March 22, 2024, 3:45pm

@liuyis can you respond to this?

liuyis · March 22, 2024, 4:12pm

@lvcunchi Could you share your report file?

I’d also suggest trying a few more profiling options to get more insights:

Try adding the --cuda-graph-trace=node option to see the node-level details of each graph execution.
Try the --gpu-metrics-devices=all option to enable GPU metrics sampling and observe the GPU metrics during these graph executions.
Try the --gpuctxsw=true option to enable GPU context switch tracing - GPU context switching was probably happening between the two processes.
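
To make the three suggestions concrete, here is a minimal sketch that assembles one `nsys profile` command line carrying all of them. It is an illustration under several assumptions, not a verified recipe: `run_both.sh` is a hypothetical launcher that starts both PyTorch processes, `two_procs` is a hypothetical report name, and using one nsys session for both processes is my assumption about a convenient setup. The three trace options are taken verbatim from the list above; whether they are available, and whether they need elevated privileges, may depend on the nsys version and the system.

```python
import shlex


def build_nsys_command(target, output_name):
    """Return an `nsys profile` argument list with the options suggested above.

    `target` is the command to profile (a list of strings); `output_name`
    is the report name passed to --output. Both are caller-supplied.
    """
    return [
        "nsys", "profile",
        "--output", output_name,
        "--cuda-graph-trace=node",    # node-level detail for each graph launch
        "--gpu-metrics-devices=all",  # GPU metrics sampling on all devices
        "--gpuctxsw=true",            # GPU context switch trace
        *target,
    ]


if __name__ == "__main__":
    # Hypothetical launcher script that starts both PyTorch processes.
    cmd = build_nsys_command(["bash", "run_both.sh"], "two_procs")
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(cmd))
```

The script only prints the command; copy the printed line into a shell (or pass the list to `subprocess.run`) once the launcher and report name match your setup.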
