I am trying to profile only selected kernels using Nsight Compute. Steps taken:
- Launch the application on target using
bazel run --config=cuda --run_under="/path/to/ncu --target-processes all --mode=launch" //path/to/binary -- args
- Launch Nsight Compute on the macOS host and attach to the remote process.
I skip a bunch of CUDA calls, which takes around 30 minutes every time (auto-profiling enabled). Is there a way to profile only specific kernels, selected by name for example, in this mode?
The command line offers several kernel filtering options based on kernel names (-k or --kernel-name), including regular expressions. See the Nsight Compute CLI section of the Nsight Compute documentation.
Some examples from that documentation include:
-k foo Match all kernels named exactly “foo”.
-k regex:foo Match all kernels that include the string “foo”, e.g. “foo” and “fooBar”.
-k regex:"foo|bar" Match all kernels including the strings “foo” or “bar”, e.g. “foo”, “foobar”, “_bar2”.
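As a rough sketch of how such a filter could be combined with the bazel command from the question, assuming a non-interactive run that writes a report file is acceptable: the snippet below only assembles and prints the command. The report path /tmp/foo_kernels is a hypothetical placeholder, and the other paths are the placeholders already used in this thread.

```shell
#!/bin/sh
# Placeholders copied from the thread; adjust them for your setup.
NCU="/path/to/ncu"
KERNEL_FILTER="regex:foo"     # same filter syntax as the -k examples above
REPORT="/tmp/foo_kernels"     # hypothetical output path for the report file

# No --mode=launch here: ncu profiles directly and applies the filter itself.
NCU_CMD="$NCU --target-processes all --kernel-name=$KERNEL_FILTER --export=$REPORT"

# Print the full bazel invocation; run the printed line on the target.
printf 'bazel run --config=cuda --run_under="%s" //path/to/binary -- args\n' "$NCU_CMD"
```

The resulting report can then be opened in the Nsight Compute GUI on the host, instead of attaching interactively.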
Right, I tried filtering the kernels using:
bazel run --config=cuda --config=cuda_profile --run_under="/path/to/ncu --target-processes all --mode=launch --kernel-name=regex:foo" //path/to/binary -- args
and the error popped up:
INFO: Build completed successfully, 1 total action
Invalid option --kernel-name specified for --mode launch
Use --help for further details.
It looks like specifying --kernel-name is incompatible with launch mode. Please correct me if I am wrong.
Is there a way to filter the kernels through the Nsight Compute GUI in Interactive Profile mode (since I am able to attach to the remote process that way)?
I tried adding a Next Trigger filter and enabled Auto-profile, but unfortunately that did not profile the kernel. Attaching screenshot:
The decision about which kernel to profile needs to be made by the attaching Nsight Compute instance, as opposed to the launching one, so --kernel-name=regex:foo and --mode=launch are incompatible. In your screenshot, you should be able to use the filter, but you don’t need the “--kernel-name=” part. If you hover over “Next Trigger” you should see the ways you can set triggers, like this: