Profile multiple kernels, once each

I’m using ncu, and I’d like to profile multiple kernels. I know I can do that with the --kernel-name flag and regex, e.g.,

--kernel-name regex:"volta_scudnn_128x32_3dconv_wgrad_xregs_large_nn_v1|volta_scudnn_128x32_3dconv_fprop_small_nn_v1|bn_fw_tr_1C11_kernel_NCHW|bn_bw_1C11_kernel_new|volta_scudnn_128x64_3dconv_fprop_xregs_large_nn_v1"

The problem is if I use -c 1 to only profile once then that applies to all kernels - so whichever one of those five is launched first gets profiled. I can use -c 5, but then if a kernel gets launched multiple times before another kernel in the list I’ll still miss profiling one.

Can I specify multiple kernel names, and have each name profiled once?

Thanks

Hi, @user22230

Thanks for using Nsight Compute.
Regarding your request, the recommendation is to use --kernel-id.

For details, refer to Training — NsightCompute 12.4 documentation

This has the same problem as --kernel-name: I can specify multiple kernels with regex but the -c argument applies to all of them as a group, not individually.

Hi @user22230
Please confirm whether you used the --kernel-id option instead of the --kernel-name option, as below:
--kernel-id ::regex:"volta_scudnn_128x32_3dconv_wgrad_xregs_large_nn_v1|volta_scudnn_128x32_3dconv_fprop_small_nn_v1|bn_fw_tr_1C11_kernel_NCHW|bn_bw_1C11_kernel_new|volta_scudnn_128x64_3dconv_fprop_xregs_large_nn_v1":1

If you want to profile the first instance of every kernel matching the provided regex, do not use the -c option along with it.
-c limits the total number of profiled kernel launches. For example, if you provide -c 3 with the above option, NCU will profile 3 kernel launches in total that match the above --kernel-id filter.
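
To make the difference concrete, here is a small Python sketch of how the two filters interact. It is only a toy model based on the behavior described in this thread, not how ncu is actually implemented. The `simulate` function, the launch list, and the kernel names are all made up for illustration, and two things are assumptions: that the regex is applied as a search over the kernel name, and that invocations are counted separately per context, stream, and kernel name.

```python
import re
from collections import defaultdict


def simulate(launches, pattern, invocation=None, launch_count=None):
    """Toy model of the filters discussed above (hypothetical, not ncu code).

    launches:     (context_id, stream_id, kernel_name) tuples in launch order.
    pattern:      regex searched for in the kernel name (assumption).
    invocation:   if set, keep only the Nth matching launch per
                  (context, stream, kernel name), like the trailing ":1".
    launch_count: if set, stop after this many profiled launches, like -c.
    """
    seen = defaultdict(int)
    profiled = []
    for ctx, stream, name in launches:
        if launch_count is not None and len(profiled) >= launch_count:
            break
        if not re.search(pattern, name):
            continue
        key = (ctx, stream, name)
        seen[key] += 1
        if invocation is None or seen[key] == invocation:
            profiled.append(f"{ctx}:{stream}:{name}:{seen[key]}")
    return profiled


# Hypothetical launch order: kernel_a runs twice before the others appear.
launches = [
    (1, 7, "kernel_a"),
    (1, 7, "kernel_a"),
    (1, 7, "kernel_b"),
    (1, 13, "kernel_a"),
    (1, 7, "kernel_c"),
]
pattern = "kernel_a|kernel_b|kernel_c"

# Name filter plus a launch limit of 3: kernel_c is never reached.
by_count = simulate(launches, pattern, launch_count=3)

# Invocation filter of 1 and no launch limit: each kernel once per stream.
by_invocation = simulate(launches, pattern, invocation=1)

# Both together: the launch limit cuts the per-kernel list short.
combined = simulate(launches, pattern, invocation=1, launch_count=3)

print(by_count)
print(by_invocation)
print(combined)
```

In this model, the launch limit alone spends two of its three slots on `kernel_a`, which is the problem from the original question, while the invocation filter alone picks up every kernel once per stream.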

You can find the profile filter options here: 4. Nsight Compute CLI — NsightCompute 12.4 documentation

I hope this will help you out.

Thanks.

Thanks, that gets closer. I get this output now, using the option --kernel-id ::regex:"gemv2N_kernel|dgrad_engine|dgrad2d_alg1_1":1 (different kernels than original post):

==PROF== Profiling “1:13:gemv2N_kernel:1”: 0%…50%…100% - 1 pass
==PROF== Profiling “1:14:gemv2N_kernel:1”: 0%…50%…100% - 1 pass
==PROF== Profiling “1:7:dgrad_engine:1”: 0%…50%…100% - 1 pass
==PROF== Profiling “1:7:dgrad2d_alg1_1:1”: 0%…50%…100% - 1 pass
==PROF== Profiling “1:25:gemv2N_kernel:1”: 0%…50%…100% - 1 pass
==PROF== Profiling “1:26:gemv2N_kernel:1”: 0%…50%…100% - 1 pass

What is the second number in 1:number:kernel name:1? A couple kernels are profiled twice but this number is different, does that indicate that the kernels are different somehow (despite having the same name)?

Hello @user22230,
Good to see the progress.

What is the second number in 1:number:kernel name:1?

It is the stream ID. "1:13:gemv2N_kernel:1" uses the same format as the --kernel-id argument: context-id:stream-id:kernel-name:invocation-nr

A couple kernels are profiled twice but this number is different, does that indicate that the kernels are different somehow (despite having the same name)?

Right, those kernels are different in the sense that they were launched on different streams.
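
If it helps when reading longer logs, the IDs can be split into their four fields and grouped by kernel name. The Python sketch below does this for the IDs from the output posted above; `KernelId` and `parse_kernel_id` are hypothetical helper names, not part of any Nsight Compute tooling.

```python
from collections import defaultdict
from typing import NamedTuple


class KernelId(NamedTuple):
    context_id: int
    stream_id: int
    kernel_name: str
    invocation_nr: int


def parse_kernel_id(text):
    """Split 'context-id:stream-id:kernel-name:invocation-nr' into fields."""
    # Split from both ends so a kernel name that itself contains ':' stays whole.
    context_id, stream_id, rest = text.split(":", 2)
    kernel_name, invocation_nr = rest.rsplit(":", 1)
    return KernelId(int(context_id), int(stream_id), kernel_name, int(invocation_nr))


# IDs copied from the profiler output earlier in this thread.
ids = [
    "1:13:gemv2N_kernel:1",
    "1:14:gemv2N_kernel:1",
    "1:7:dgrad_engine:1",
    "1:7:dgrad2d_alg1_1:1",
    "1:25:gemv2N_kernel:1",
    "1:26:gemv2N_kernel:1",
]

# Collect the stream IDs on which each kernel name was profiled.
streams_by_kernel = defaultdict(list)
for kid in map(parse_kernel_id, ids):
    streams_by_kernel[kid.kernel_name].append(kid.stream_id)

for name, streams in streams_by_kernel.items():
    print(name, streams)
```

Grouped this way, gemv2N_kernel shows up on four different streams with invocation number 1 on each, which is why it appears four times in the output.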

Best Regards.

Perfect, this is what I want. Thanks!
