Why would code run 1.7x faster when run with nvprof than without?

Greg · December 21, 2017, 6:40am

This is a tough question to answer because you haven't provided any details. What are you measuring: kernel duration, process duration, or something else? How are you measuring time? Is this GPU or CPU time? What are the reported times? What GPU is the code running on?

The profilers do change a few behaviors:

disable some power management when capturing PM counters
increase the GPU timer frequency from 1 MHz to 31.25 MHz (a tick of 1 µs versus 32 ns)
measure kernel execution time more precisely than is possible with CUDA events
increase CPU overhead
flush work to the GPU faster (usually a difference on Windows, not Linux)
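
To make the "GPU or CPU time?" question concrete, here is a minimal sketch that times the same launch two ways: with CUDA events (GPU-side timestamps) and with a host wall clock around the launch. The kernel `scale`, the helper `time_kernel`, and the `TimingResult` struct are hypothetical stand-ins for illustration, not anything from the original question. Error checking is left out for brevity.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the code being measured.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical result holder: both times are in milliseconds.
struct TimingResult {
    float gpu_ms;    // elapsed time between the two CUDA events
    double host_ms;  // wall-clock time seen by the CPU thread
};

TimingResult time_kernel(int n) {
    float *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Warm-up launch so one-time costs such as context creation
    // are not counted in the measurement.
    scale<<<blocks, threads>>>(d_data, n, 2.0f);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    auto host_begin = std::chrono::steady_clock::now();
    cudaEventRecord(start);                       // timestamp recorded on the GPU
    scale<<<blocks, threads>>>(d_data, n, 2.0f);  // asynchronous launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);                   // wait until the kernel has finished
    auto host_end = std::chrono::steady_clock::now();

    TimingResult result{};
    cudaEventElapsedTime(&result.gpu_ms, start, stop);
    result.host_ms =
        std::chrono::duration<double, std::milli>(host_end - host_begin).count();

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return result;
}

int main() {
    TimingResult r = time_kernel(1 << 20);
    printf("CUDA events: %.3f ms, host clock: %.3f ms\n", r.gpu_ms, r.host_ms);
    return 0;
}
```

The host figure includes launch and synchronization overhead on top of the kernel itself, so the two numbers will generally differ. Reporting both, with and without nvprof, along with the GPU model, would help narrow down which of the behaviors listed above is responsible.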
