With Nsight Systems, executable doesn't run as expected

We have a bash script that launches a predictor server (once the server is up, a replayer is launched on another devserver). The script works well for evaluating the performance of our models. We now want to add nsys to the script to trace system performance. Here is how we use nsys in the script (the lines before this are skipped):

When we run the script with this setting, it runs for a few seconds and then exits without producing anything: no report, none of the output the old script printed before nsys was added, and no error messages.

If we run $CMD directly in the script, it works as before.
If I change “nsys launch” to “nsys profile” and set the options correctly, it generates a report covering about 3 s, but the script still exits very quickly and the server never comes up. As a result, we can’t launch our replayer.

I also tried the script without “nsys start”/“nsys stop”. The result is the same: no output at all.

Can any expert from the community help with this? Thanks!

Does your script launch the predictor server and then terminate or does it stay running while the predictor is active?

If it terminates, I think what is happening is that nsys is tracing the script, and since the script probably does not contain any CUDA/nvtx, we are not collecting any results.

Are you capturing stdout/stderr?

I don’t think it launches the predictor server, as the script terminates very quickly, within a few seconds. Nothing is printed to stdout/stderr in the terminal. Even when I added “2>&1 | tee server.log” to generate a log file, it didn’t capture anything.

Even when I tried the following two ways to run nsys, they didn’t work:
nsys launch -w true -t cuda,nvtx,cudnn,cublas ./launch_server.sh
where launch_server.sh is the script, which runs successfully on its own.

nsys launch -w true -t cuda,nvtx,cudnn,cublas /root/jasonjk/bin/predictor_run
where predictor_run is the binary executable.

In either case, the command exited in less than 10 seconds and left nothing on screen and an empty log file.
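
One thing that may be worth ruling out with the “2>&1 | tee server.log” attempt: after a pipeline, $? reports the status of tee, not of the command in front of it, so a crash can look like a clean exit. Below is a minimal sketch that redirects to the log file directly and reports the command's own exit status. run_logged is a hypothetical helper name, not part of nsys, and the sketch does not assume that nsys passes the application's exit status through.

```shell
# Hypothetical helper: run a command, send its stdout and stderr to a log
# file, then print the log and return the command's own exit status.
# $1 = log file, remaining arguments = the command to run.
run_logged() {
    log_file="$1"
    shift
    "$@" >"$log_file" 2>&1
    status=$?
    cat "$log_file"
    echo "exit status: $status"
    return "$status"
}
```

It could be used as: run_logged server.log nsys launch -w true -t cuda,nvtx,cudnn,cublas ./launch_server.sh. By shell convention, a status above 128 usually means the process was terminated by a signal (139 corresponds to SIGSEGV, for example).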

Hi @AlbertHu, nsys launch is meant to be used with nsys start. If you want to profile the application immediately after it starts, please try nsys profile instead of nsys launch. More details here in the user guide: User Guide :: Nsight Systems Documentation
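
To illustrate the pairing of nsys launch with nsys start, here is a rough sketch of the interactive workflow, assuming the default session and a fixed collection window. profile_window and the NSYS variable are hypothetical names introduced for this example; the report name and duration are placeholders, and the options available should be checked against nsys start --help for the installed version.

```shell
# Terminal 1 (not run by this snippet): start the server under nsys without
# collecting yet, e.g.
#   nsys launch -w true -t cuda,nvtx,cudnn,cublas ./launch_server.sh

# Terminal 2: collect for a fixed window once the replayer is running.
NSYS="${NSYS:-nsys}"   # hypothetical override so the binary can be swapped

# Hypothetical helper: $1 = report name, $2 = seconds to collect.
profile_window() {
    "$NSYS" start --output="$1"   # begin collection in the default session
    sleep "$2"                    # let the workload run while collecting
    "$NSYS" stop                  # end collection and write the report
}
```

For example, profile_window predictor_report 30 would collect for roughly 30 seconds. If nsys launch itself exits within seconds, as described above, there is no running session for nsys start to attach to, so the early exit has to be resolved first.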

I have a separate script for nsys start and nsys stop as I want to profile the application after I launch the replayer. However, I have to launch the predictor server before I launch the replayer on a second GPU server. My question is about making nsys work with my application first.

I actually tried “nsys profile”. See my original post about it.

This behaviour is likely caused by your application crashing while Nsight Systems is actively profiling it. We sometimes see the same issue. Attaching gdb to your predictor binary as it gets spun up by Nsight Systems may help identify why your application is exiting.
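
As a rough sketch of the gdb suggestion: since the process only lives for a few seconds, it may help to poll for it from a second terminal and attach as soon as it appears. wait_for_pid is a hypothetical helper, and the process name predictor_run is taken from the command quoted earlier in the thread. The sketch assumes pgrep and gdb are installed and that the user is allowed to attach to the process.

```shell
# Hypothetical helper: poll until a process with the given exact name
# exists, then print the PID of the newest match.
# $1 = process name, $2 = maximum number of attempts (default 100).
wait_for_pid() {
    name="$1"
    attempts="${2:-100}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # pgrep -x matches the process name exactly; -n picks the newest.
        pid=$(pgrep -n -x "$name") && { echo "$pid"; return 0; }
        i=$((i + 1))
        sleep 0.1
    done
    return 1
}
```

In a second terminal, started just before the nsys command, this could be used as: pid=$(wait_for_pid predictor_run) && gdb -p "$pid". Inside gdb, continue lets the process run on, and bt prints a backtrace if it stops on a fatal signal.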