CUDA Compute Profiling results: What is the profiling mechanism of GPU performance counters?

thanh_tuan · May 17, 2011, 11:35am

Hi,
Can anyone explain how performance counters work on the GPU? Does each SM have its own set of performance counters?
I ask because I want to know the number of instructions executed by ONE specific block (a work-group in OpenCL).
In some cases, the instruction count reported for a kernel launched with 1 block is no different from that of the same kernel launched with 4 blocks. My guess is that the result is sampled on one SM and then multiplied by the number of SMs. Is that true?
If so, assume there are 16 SMs. If all blocks do similar work (not much variance in the number of instructions), the result for a kernel launched with 1 block should be the same as with 16 blocks, right? And what happens if the kernel is launched with 17 blocks?
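To make the guess concrete, here is a minimal Python sketch of the model being asked about. It is only an illustration of the hypothesis, not a description of how the profiler actually works: the round-robin assignment of blocks to SMs, the choice of SM 0 as the observed one, and the names `blocks_on_observed_sm`, `reported_instructions`, and `instr_per_block` are all assumptions made up for this example.

```python
def blocks_on_observed_sm(num_blocks, num_sms=16, observed_sm=0):
    """Count the blocks that land on one SM.

    Assumption (hypothetical): block i is assigned to SM (i % num_sms).
    Real hardware scheduling is not confirmed to work this way.
    """
    return len(range(observed_sm, num_blocks, num_sms))


def reported_instructions(num_blocks, instr_per_block, num_sms=16):
    """Instruction count seen by a counter that watches a single SM.

    Assumption (hypothetical): every block executes instr_per_block
    instructions and only SM 0 is sampled.
    """
    return blocks_on_observed_sm(num_blocks, num_sms) * instr_per_block


# Compare launches with different block counts under this model.
for n in (1, 4, 16, 17):
    print(n, reported_instructions(n, instr_per_block=1000))
```

Under these assumptions, launches with 1, 4, and 16 blocks all give the same count, because the observed SM runs exactly one block in each case, while a 17-block launch gives twice that, because the 17th block wraps around onto the observed SM. Whether the real counters behave this way is exactly the open question.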

Thanks,
Tuan

thanh_tuan · May 19, 2011, 6:16am

Any help?
