Kernel statistics for different runs


mahmood.nt August 2, 2019, 7:14am 1

I have profiled an ML application twice: once with 1 epoch (Pasteboard - Uploaded Image) and once with 3 epochs (Pasteboard - Uploaded Image). As you can see in the pictures, the profile duration for the second run is longer than for the first one, which is expected. However, for the largest kernel, the number of FP, INT, and other instructions remains unchanged.

That is weird, isn't it?
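One possible reading, offered only as an assumption because the screenshots are not reproduced here: if the profiler reports instruction metrics per kernel invocation (for example as a per-launch average) rather than summed over the whole run, then more epochs would raise the invocation count and the total duration while leaving the per-launch instruction numbers the same. The sketch below is a toy model of that idea, not profiler output; `profile_run`, `KernelProfile`, and every number in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KernelProfile:
    invocations: int            # how many times the kernel was launched
    total_duration_us: float    # time summed over all launches
    avg_fp_instructions: float  # FP instructions averaged per launch


def profile_run(epochs, launches_per_epoch=100,
                fp_per_launch=1_000_000, us_per_launch=50.0):
    """Toy model: every launch does identical work (hypothetical numbers)."""
    invocations = epochs * launches_per_epoch
    total_duration_us = invocations * us_per_launch
    total_fp = invocations * fp_per_launch
    return KernelProfile(invocations, total_duration_us,
                         total_fp / invocations)


one_epoch = profile_run(1)
three_epochs = profile_run(3)

# Duration and invocation count scale with epochs;
# the per-launch instruction average does not.
print(one_epoch)
print(three_epochs)
```

If that assumption holds, comparing the invocation count of the largest kernel between the two runs would show whether the extra epochs appear there instead of in the instruction columns.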
