need clarity in definition of inst_per_warp

bharat629 · June 19, 2017, 11:43pm

As I understand the nvprof guide, instructions per warp (inst_per_warp) is the number of instructions per thread * 32, assuming no branches and every thread following the same path. That is why I am getting results on the order of 10^4. Can anyone tell me whether my understanding is correct? There is no descriptive documentation available for the various metrics.
What is the significance of inst_per_warp?

==32315== NVPROF is profiling process 32315, command: ./a.out
==32315== Profiling application: ./a.out
==32315== Profiling result:
==32315== Metric result:
Invocations                               Metric Name                        Metric Description         Min         Max         Avg
Device "Tesla K20m (0)"
	Kernel: intLatKernel(__int64*, __int64*)
          1                             inst_per_warp                     Instructions per warp  1.3020e+04  1.3020e+04  1.3020e+04

Thanks

LongY · June 24, 2017, 3:41am

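According to the CUPTI documentation (CUPTI :: CUDA Toolkit Documentation), inst_per_warp is the average number of instructions executed by each warp. An instruction issued to a warp counts once for the whole warp, so the value is not the per-thread count multiplied by 32.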
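To make the arithmetic concrete, here is a minimal Python sketch of how an average like this could be computed from a warp-level instruction count and the number of warps launched. The function names and launch configurations are hypothetical, chosen only for illustration. The sketch assumes that each block rounds its thread count up to whole warps of 32 threads, and it is not taken from the profiler's implementation.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def warps_launched(grid_blocks: int, threads_per_block: int) -> int:
    """Number of warps in a launch; each block is rounded up to whole warps."""
    warps_per_block = -(-threads_per_block // WARP_SIZE)  # ceiling division
    return grid_blocks * warps_per_block


def inst_per_warp(warp_instructions: int, warps: int) -> float:
    """Average number of warp-level instructions executed by each warp."""
    return warp_instructions / warps


def max_thread_instructions(warp_instructions: int) -> int:
    """Upper bound on thread-level instructions: every lane active on every instruction."""
    return warp_instructions * WARP_SIZE


if __name__ == "__main__":
    # Hypothetical launch: 4 blocks of 64 threads, i.e. 8 warps,
    # each executing 13020 warp-level instructions.
    warps = warps_launched(4, 64)
    total = 13020 * warps
    print(inst_per_warp(total, warps))
    print(max_thread_instructions(13020))
```

Under these assumptions, the factor of 32 only appears when converting a warp-level count into a thread-level count. The per-warp average itself is independent of how many threads are in the warp.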
