Which metric should I collect from ncu profiler if I want to get the IOPS (integer operations per second) for my kernel?

Suppose my kernel has only integer operations including memory access. Which metric should I collect from ncu profiler if I want to get the IOPS (integer operations per second) for my kernel?

For memory accesses, like loads and stores, the metrics don’t differentiate the datatype (like int vs float). I’m not sure if that’s what you meant by “including memory accesses”. For the execution, you can look at the Instruction Statistics section which will include a chart of all the opcodes. If you hover over one, you will be able to tell what type of instruction it is, including if it is an integer instruction. You could use those counts to determine this.
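As a rough illustration of how those counts could be turned into an IOPS figure, here is a minimal Python sketch. It assumes you have already copied the per-opcode counts out of the Instruction Statistics chart into a dictionary and that you have the kernel duration in nanoseconds (for example from gpu__time_duration.sum). The opcode set and the function name are hypothetical and only for illustration; which opcodes count as integer depends on your architecture. Note also that these are warp-level instruction counts, so the result is integer instructions per second, not per-thread operations.

```python
# Illustrative (assumed) set of integer SASS opcodes; adjust for your GPU architecture.
INTEGER_OPCODES = {"IMAD", "IADD3", "LOP3", "SHF", "ISETP", "LEA"}


def integer_ops_per_second(opcode_counts, duration_ns, integer_opcodes=INTEGER_OPCODES):
    """Return executed integer instructions per second.

    opcode_counts: mapping of opcode name -> executed instruction count
    duration_ns:   kernel duration in nanoseconds
    """
    if duration_ns <= 0:
        raise ValueError("duration_ns must be positive")
    # Sum only the opcodes we classify as integer instructions.
    integer_instructions = sum(
        count for opcode, count in opcode_counts.items() if opcode in integer_opcodes
    )
    # Convert nanoseconds to seconds by scaling the numerator by 1e9.
    return integer_instructions * 1e9 / duration_ns


if __name__ == "__main__":
    # Made-up example counts, not real profiler output.
    counts = {"IMAD": 4000, "IADD3": 1000, "LDG": 500, "FFMA": 250}
    print(integer_ops_per_second(counts, duration_ns=2000))
```

Whether a fused instruction such as IMAD should count as one operation or two is a design choice; this sketch counts each executed instruction once.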

@jmarusarz I can find the count of integer instructions within the ‘Instruction Statistics’ section of NCU. However, I’m not sure how to retrieve the corresponding metric names for each type of instruction. This information would be helpful for extracting data from the raw profiling results and calculating IOPS. Currently, when I hover over an instruction in NCU, only the instruction name is displayed, without any accompanying metric names.

I also hope that NCU could include a section file for IOPS, similar to SpeedOfLight_HierarchicalDoubleRooflineChart.section.

The list you show is generated from the “sass__inst_executed_per_opcode” metric. You may be able to get what you want from “sass__inst_executed_per_opcode_with_modifier_selective”.

See the "Instructions Per Opcode Metrics" section of the documentation.

The metric sass__inst_executed_per_opcode appears to represent the total count of executed instructions. However, I’m seeking a method to extract the count of a particular instruction type using a specific metric name, e.g. “IMAD.”

I tried to use the --print-metric-name=name option to establish this connection. The per-opcode breakdown of sass__inst_executed_per_opcode appears to be reported as a tuple of values in descending order. However, I still cannot tell which value belongs to which instruction.


Apologies, I misinterpreted the metric I referred you to. It appears individual opcode metrics aren’t available.

I think this is a gap in the CLI capabilities. As a workaround, you can open the Metrics Details Window and put the sass__inst_executed_per_opcode metric in the search field. Then select all the rows below it, press Ctrl+C to copy them to the clipboard, and paste them elsewhere. You’ll get a CSV like this:
Correlation ID,Instance Value