measure integer instructions by nvprof

llodds · June 11, 2019, 1:54am

Background: I am counting the FLOPs of my application on a GPU. I assume CUDA cores also perform integer operations. I know the metric for single-precision FLOPs is flop_count_sp. What is the metric name for measuring the total number of integer arithmetic instructions?

Also, does an integer add have the same latency as a single-precision floating-point add? Where can I find this latency information?

Thanks,
M.

Robert_Crovella · June 11, 2019, 2:40am

The nvprof profiler metrics reference is here:

Profiler :: CUDA Toolkit Documentation

The metric you are looking for may be inst_integer.
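As a minimal sketch of how such a measurement could be run: the binary name ./my_app is a placeholder and not something from this thread, and the metric names are the two mentioned above. Whether a given metric is available depends on the GPU, so the sketch first queries the list of supported metrics.

```shell
# Hypothetical application binary; replace with your own executable.
APP=./my_app

# Metrics named in this thread: integer instruction count and
# single-precision floating-point operation count.
METRICS="inst_integer,flop_count_sp"

if command -v nvprof >/dev/null 2>&1; then
  # Check that the metrics are offered on the installed GPU.
  nvprof --query-metrics | grep -E 'inst_integer|flop_count_sp'

  # Collect both metrics for every kernel launched by the application.
  nvprof --metrics "$METRICS" "$APP"
else
  echo "nvprof not found on PATH" >&2
fi
```

Collecting metrics makes nvprof replay kernels, so the run will likely take noticeably longer than an unprofiled one.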

As far as I am aware, NVIDIA does not publish instruction latency anywhere.

You can estimate the relative throughput of some instructions from table 2 in the programming guide:

Programming Guide :: CUDA Toolkit Documentation
