Profiling on BlueField-2 DPU

We are running a server on the Arm processor that receives jobs from a remote host and runs them on the GPU.

I’d like to profile these jobs with Nsight Systems. However, when run, the application only launches the server and then waits for a job, so “nsys profile application” does not detect any activity on the GPU. GPU activity must be triggered by the remote host after the application starts on the DPU.

So far, I have not found a way to profile these jobs with Nsight Systems. Any help is greatly appreciated.

@afroger any thoughts?

If I understand correctly:

  • The server is running a service process whose responsibility is to listen on some form of IPC or socket, waiting for work to be scheduled
  • The client processes communicate with the service process running on the server to tell it to execute some work

In that case, you need to use “nsys profile” or “nsys launch” to start the service process running on the server. Many Nsight Systems features require tool libraries to be injected into the processes. The only way to do that is to launch the process you want to profile with the GUI or CLI.

If you don’t want to profile for the whole lifetime of the program, you can always use interactive commands: “nsys launch”, “nsys start”, “nsys stop”, etc.

Does my explanation help in any way?

This is helpful.
When I used “nsys profile” or “nsys launch” the profiler was not capturing GPU activity.
However, with “nsys profile --duration” I am able to submit a job and execute the work within the duration, and the work is captured by the profiler.
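For reference, a fixed-window capture along these lines can be sketched as below. The server binary ./server, the 60-second window, and the report name job_capture are placeholders, not values from this thread; --duration, --trace, and --output are standard “nsys profile” options. The script only prints the command so it can be checked before it is run.

```shell
#!/bin/sh
# Sketch: profile a long-running service for a fixed time window.
# SERVER_CMD, WINDOW_SECONDS and REPORT_NAME are hypothetical placeholders.
SERVER_CMD="./server"
WINDOW_SECONDS="60"
REPORT_NAME="job_capture"

# Print the nsys command line instead of executing it, so it can be inspected.
timed_profile_cmd() {
    printf 'nsys profile --duration=%s --trace=cuda --output=%s %s\n' \
        "$WINDOW_SECONDS" "$REPORT_NAME" "$SERVER_CMD"
}

timed_profile_cmd
```

On the DPU, running eval "$(timed_profile_cmd)" would launch the server under the profiler; the remote host then has to submit its job before the window expires. What happens to the server process once the window ends depends on the nsys version and options, so it is worth checking “nsys profile --help”.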

Note that you can use the interactive commands “nsys start” and “nsys stop” on the server to control the profiling session of the service process.

For example:

  • Launch the service with the profiler. It is not profiled or traced yet; use nsys profile instead if you want profiling to start immediately:
$ nsys launch --session-new service [...]
  • Start the profiling session:
$ nsys start --session service
  • Stop the profiling session. An “.nsys-rep” file will be generated:
$ nsys stop --session service

You can use nsys start and nsys stop multiple times on the same profiling session. We also have capture triggers available through the --capture-range option. If the jobs include calls to “cudaProfilerStart” and “cudaProfilerStop”, you can set --capture-range cudaProfilerApi to start and stop the profiling session automatically.
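A capture-range launch might look like the sketch below. It assumes the job code itself calls cudaProfilerStart() and cudaProfilerStop() around the work of interest; ./server and the report name job_range are placeholders, not values from this thread. The script prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: let cudaProfilerStart()/cudaProfilerStop() in the job code
# open and close the capture window.
# SERVER_CMD and RANGE_REPORT are hypothetical placeholders.
SERVER_CMD="./server"
RANGE_REPORT="job_range"

# Print the nsys command line instead of executing it, so it can be inspected.
capture_range_cmd() {
    printf 'nsys profile --capture-range=cudaProfilerApi --trace=cuda --output=%s %s\n' \
        "$RANGE_REPORT" "$SERVER_CMD"
}

capture_range_cmd
```

If the server handles several jobs per run, the --capture-range-end option controls whether collection ends after the first cudaProfilerStop() or can repeat. Its accepted values vary between versions, so consult “nsys profile --help” before relying on a particular setting.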
