Profile cuda kernel

TrailingStop · January 4, 2023, 8:26am

Hi,

I am trying to find out whether the modifications I made to a kernel give better or worse performance. I run the kernel several hundred times and use the average time to compare the different kernels. I guess that these timings depend on the core and memory clock values. Since the core clock drops as the GPU temperature increases, I get better results when I start with a cool GPU than with a hot one. So this way of comparing kernel performance is not very objective.
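For reference, the measurement loop described above can be sketched roughly as follows. This is only a minimal sketch: the `scale` kernel, the problem size, and the run and warm-up counts are hypothetical placeholders for the kernel under test, and reporting the median next to the mean is an added assumption, not something the measurement requires.

```cuda
#include <algorithm>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel being compared.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Median of the timings; less sensitive to outliers than the mean.
float median_ms(std::vector<float> t) {
    std::sort(t.begin(), t.end());
    size_t m = t.size() / 2;
    return (t.size() % 2) ? t[m] : 0.5f * (t[m - 1] + t[m]);
}

int main() {
    const int n = 1 << 20;   // assumed problem size
    const int warmup = 10;   // untimed launches before measuring
    const int runs = 200;    // timed launches

    float* d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    std::vector<float> times;
    for (int r = 0; r < warmup + runs; ++r) {
        cudaEventRecord(start);
        scale<<<(n + 255) / 256, 256>>>(d, n, 1.0f);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);  // wait until the kernel has finished

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        if (r >= warmup) times.push_back(ms);  // discard warm-up launches
    }

    float sum = 0.0f;
    for (float t : times) sum += t;
    std::printf("mean %.4f ms, median %.4f ms over %d runs\n",
                sum / times.size(), median_ms(times), runs);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return 0;
}
```

Even with warm-up launches and a median, the numbers from a loop like this still move with the clocks, which is the problem described here.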

Is there a better way to find out which kernel performs better, independent of the core/memory clocks and GPU temperature?

Thanks,
Daniel

njuffa · January 4, 2023, 11:22am

Have you had a chance to read through NVIDIA’s recommendations regarding this topic?

TrailingStop · January 4, 2023, 12:47pm

Hi - great. Thanks.

rs277 · January 4, 2023, 5:58pm

Not all of the nvidia-smi commands mentioned in the document are supported on some GeForce cards. Also, although the document only mentions the “Lock clocks to base” setting in Nsight Graphics, there is a similar setting in Nsight Compute.
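On cards where the clocks cannot be locked, one possible fallback is to at least record the clocks and temperature around a benchmark run, so that runs taken under different conditions are not compared with each other. The following is a rough sketch using NVML (host code only, linked with `-lnvidia-ml`). The `GpuState` struct, the helper names, and the 15 MHz tolerance are hypothetical choices for illustration. It also assumes that NVML device index 0 is the GPU running the kernel, which is not guaranteed to match the CUDA device ordering.

```cuda
#include <cstdio>
#include <nvml.h>

// Snapshot of the values that influence kernel timings (hypothetical helper type).
struct GpuState {
    unsigned int sm_mhz;
    unsigned int mem_mhz;
    unsigned int temp_c;
};

// Reads the current SM clock, memory clock, and GPU temperature.
bool read_gpu_state(nvmlDevice_t dev, GpuState* s) {
    return nvmlDeviceGetClockInfo(dev, NVML_CLOCK_SM, &s->sm_mhz) == NVML_SUCCESS &&
           nvmlDeviceGetClockInfo(dev, NVML_CLOCK_MEM, &s->mem_mhz) == NVML_SUCCESS &&
           nvmlDeviceGetTemperature(dev, NVML_TEMPERATURE_GPU, &s->temp_c) == NVML_SUCCESS;
}

// True if the SM clock moved by at most tol_mhz between two snapshots.
bool sm_clock_stable(const GpuState& a, const GpuState& b, unsigned int tol_mhz) {
    unsigned int diff = a.sm_mhz > b.sm_mhz ? a.sm_mhz - b.sm_mhz : b.sm_mhz - a.sm_mhz;
    return diff <= tol_mhz;
}

int main() {
    if (nvmlInit() != NVML_SUCCESS) {
        std::fprintf(stderr, "nvmlInit failed\n");
        return 1;
    }
    nvmlDevice_t dev;
    if (nvmlDeviceGetHandleByIndex(0, &dev) != NVML_SUCCESS) {  // assumed device index
        std::fprintf(stderr, "could not get NVML device 0\n");
        nvmlShutdown();
        return 1;
    }

    GpuState before{}, after{};
    bool ok = read_gpu_state(dev, &before);

    // ... run the timed kernel launches here ...

    ok = ok && read_gpu_state(dev, &after);
    if (ok) {
        std::printf("SM clock %u -> %u MHz, mem clock %u -> %u MHz, temp %u -> %u C\n",
                    before.sm_mhz, after.sm_mhz, before.mem_mhz, after.mem_mhz,
                    before.temp_c, after.temp_c);
        if (!sm_clock_stable(before, after, 15))  // tolerance is an arbitrary assumption
            std::printf("SM clock changed during the run; timings may not be comparable\n");
    }

    nvmlShutdown();
    return ok ? 0 : 1;
}
```

This only detects a change between the two snapshots. It does not prevent the clocks from dropping in between, so locking the clocks remains preferable where it is supported.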

TrailingStop · January 4, 2023, 9:58pm

Hi - can you point me to the location of this lock setting in Nsight Compute? I couldn’t find it.

Robert_Crovella · January 4, 2023, 10:16pm

see here

That entire section on “reproducibility” may be of interest. Also see here

Questions about Nsight Compute may get better support on the Nsight Compute forum.

rs277 · January 5, 2023, 12:49am

It’s the last setting in this window here - “Clock Control”

