Using Nsight Systems for Large-Scale Profiling

Ziqi · October 7, 2022, 4:04am

I am working on an HPC project that requires me to profile a system with 48 physical CPU cores and 5 GPUs, running at most 40 CPU processes and 10 GPU processes. From what I have observed, the fewer processes there are, the more likely Nsight Systems is to profile the system successfully. At full scale, Nsight Systems usually crashes. I would really like to profile the system at full scale without a crash. Any advice?

mjain · October 7, 2022, 5:22am

Moved to the Nsight Systems forum

hwilper · October 14, 2022, 10:40pm

How long are you profiling for? Can you give me the CLI command you are using (or tell me the GUI options)?

Ziqi · October 16, 2022, 12:38am

nsys profile -f true --trace-fork-before-exec=true -o /home/zfan/sandbox/profile/smaq_96cpu_10gpu_50c_10h_1t ./LeafStandAlone.x86-64 -noForcedPatches /home/zfan/sandbox/JobDump/MCAT_SP_220M/JobInfo_26_220 /home/zfan/sandbox/JobDump/MCAT_SP_220M/JobInfo_26_220.xml 20

hwilper · October 18, 2022, 7:23pm

So you ran your application with the default sampling and trace options until you cancelled. How long did it run before crashing? Did it leave a .qdstrm or .nsys-rep file?