I’m interested in how nvprof calculates metrics from events. Is there some reference documentation that talks about how this is done? I realize it’s GPU-specific, as different GPUs expose different event counters.
In particular, I’m interested in how nvprof is able to calculate metrics like flop_count_sp_add on my GeForce GTX 560 Ti (Fermi). I can’t see anything floating-point related in the available events; all I see are event counters related to instructions issued in the two arithmetic pipelines, which handle both integer and floating-point instructions. Does nvprof insert triggers into various basic blocks in the code and then post-process the results to count the individual types of instructions? Or am I completely missing something here? Thanks!