Which ncu command option is better, "--launch-count" or "--kernel-id"?

I would like to get the “Achieved Occupancy(%)” for a kernel in my application.
When I collect it with the Nsight Compute CLI, my application takes much longer to run than it does without the profiler. I suppose this is due to the profiling overhead of Nsight Compute.

To reduce the overhead, I would like to thin out the profile data that Nsight Compute collects.
Which ncu command option should I use, “--launch-count” or “--kernel-id”?

Hi, @KOUCHI_Hiroyuki

You can refer to “2. Kernel Profiling Guide” in the Nsight Compute 12.5 documentation to check the factors that affect overhead.

You can refer to “4. Nsight Compute CLI” in the Nsight Compute 12.5 documentation to add filters on the command line and reduce overhead.
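
To make the difference between the two options in the question concrete, here is a minimal sketch. It assumes a kernel named `myKernel` and an application `./sample`, both hypothetical placeholders. As I understand the documented `--kernel-id` format, leaving the context and stream fields empty matches any context and stream, but please check this against the CLI documentation for your version.

```shell
#!/bin/sh
# Hypothetical placeholders: adjust to your application and kernel.
APP="./sample"
KERNEL="myKernel"

# Option A: --kernel-id selects one specific invocation of a kernel.
# Format: context-id:stream-id:[name-operator:]kernel-name:invocation-nr
# Context and stream are left empty here; this targets the 3rd invocation.
BY_ID="ncu --kernel-id ::$KERNEL:3 $APP"

# Option B: --kernel-name plus --launch-count collects results for the
# first 5 launches whose name matches; later launches are not profiled.
BY_COUNT="ncu --kernel-name $KERNEL --launch-count 5 $APP"

# Print the commands so they can be reviewed before running them.
echo "$BY_ID"
echo "$BY_COUNT"
```

Roughly, `--kernel-id` fits when one particular invocation is of interest, while `--launch-count` fits when any few matching launches will do.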

@veraj -san,
Thank you for your reply and for all the information.
I read the section on overhead.
According to it, to reduce profiling overhead I should use the CLI options “--section” and “--launch-count”. Is that right?

Hi, @KOUCHI_Hiroyuki

Yes. Your understanding is right.

-c [ --launch-count ] arg    Limit the number of collected profile results. The count is only incremented for launches that match the kernel filters.

--section arg    Collect the section by providing section identifier.
--set arg    Identifier of section set to collect. If not specified, the basic set is collected.

For example, you can use “ncu --set detailed -c 5 ./sample” to profile the first 5 kernel launches of sample with the detailed set.
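
Since the goal in this thread is only Achieved Occupancy, a narrower command may reduce overhead further. The following is a minimal sketch, assuming a kernel named `myKernel` and an application `./sample` (both hypothetical placeholders). `Occupancy` is the identifier of the section that reports Achieved Occupancy in the sections shipped with Nsight Compute; you can confirm the identifiers available on your installation with `ncu --list-sections`.

```shell
#!/bin/sh
# Hypothetical placeholders: adjust to your application and kernel.
APP="./sample"
KERNEL="myKernel"

# Collect only the Occupancy section (--section Occupancy),
# only for kernels whose name matches $KERNEL (--kernel-name),
# and only for the first 5 matching launches (--launch-count).
NCU_CMD="ncu --section Occupancy --kernel-name $KERNEL --launch-count 5 $APP"

# Print the command so it can be reviewed.
echo "$NCU_CMD"

# Run it only when ncu and the application are actually available.
if command -v ncu >/dev/null 2>&1 && [ -x "$APP" ]; then
    $NCU_CMD
fi
```

Collecting a single section instead of a whole set means fewer metrics per kernel, which generally means fewer replay passes and less overhead.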


Thank you for checking and for your reply. I will do that.
