I am trying to profile AlexNet inference in PyTorch using Nsight Systems. When I run the Python script directly, it takes 47 seconds. However, when I run it under nsys (exact command below), it takes far longer: the output file covers 11 minutes. I do not believe inference itself took 11 minutes. What could be causing this delay?

sudo nsys profile python3 inference.py

Sometimes, the following warning shows up.

The target application terminated. One or more process it created re-parented.
Waiting for termination of re-parented processes.
Use the `--wait` option to modify this behavior.


The profiling tool adds extra work to collect the requested trace data,
so it is expected that the run takes longer to finish.
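
To see how much of the reported 11 minutes is actually spent in inference, one option is to time the inference call inside the script and compare that number with and without nsys. The rest of the reported duration would then be process startup, trace collection, or (as the warning suggests) time nsys spends waiting for re-parented processes. Below is a minimal sketch using only the standard library; `wall_clock` and `run_inference` are hypothetical names, and `run_inference` is a stand-in for the real AlexNet forward passes in inference.py. For GPU work the real script would also need to synchronize the device before reading the clock.

```python
import time
from contextlib import contextmanager


@contextmanager
def wall_clock(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def run_inference():
    # Hypothetical placeholder for the model's forward passes.
    return sum(i * i for i in range(100_000))


timings = {}
with wall_clock("inference", timings):
    output = run_inference()

print(f"inference took {timings['inference']:.3f} s")
```

If the time printed under nsys is close to the time printed without it, most of the extra duration lies outside the inference loop itself.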

However, you don’t need to run inference under nsys every time.
nsys is just there to help you debug and optimize when you need it.


