Nsight Compute Clock Speed During Profiling

Gaetan · March 16, 2022, 11:15pm

I’ve been profiling a kernel on an A100 GPU. Nsight Compute reports an SM Frequency of 1.09 cycles/ns, which appears to correspond to the nvidia-smi output below. However, when I run the same command while the application is running without NCU, it lists 1410 MHz for the Graphics and SM clocks, which is the boost clock I expect. Is there a reason why the clock is lower when profiling than in an actual run? This results in incorrect roofline profiles in Nsight Compute due to the artificially low clock speed.

Thanks
Gaetan

nvidia-smi -q -d CLOCK -i 0

==============NVSMI LOG==============

Timestamp                                 : Wed Mar 16 16:06:13 2022
Driver Version                            : 470.57.02
CUDA Version                              : 11.4

Attached GPUs                             : 8
GPU 00000000:07:00.0
    Clocks
        Graphics                          : 1095 MHz
        SM                                : 1095 MHz
        Memory                            : 1215 MHz
        Video                             : 585 MHz
    Applications Clocks
        Graphics                          : 1095 MHz
        Memory                            : 1215 MHz
    Default Applications Clocks
        Graphics                          : 1095 MHz
        Memory                            : 1215 MHz
    Max Clocks
        Graphics                          : 1410 MHz
        SM                                : 1410 MHz
        Memory                            : 1215 MHz
        Video                             : 1290 MHz
    Max Customer Boost Clocks
        Graphics                          : 1410 MHz
    SM Clock Samples
        Duration                          : Not Found
        Number of Samples                 : Not Found
        Max                               : Not Found
        Min                               : Not Found
        Avg                               : Not Found
    Memory Clock Samples
        Duration                          : Not Found
        Number of Samples                 : Not Found
        Max                               : Not Found
        Min                               : Not Found
        Avg                               : Not Found
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
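
As a quick sanity check on the numbers in this output, the sketch below converts the 1095 MHz clock to cycles per nanosecond and compares it with the 1410 MHz boost clock. The function names are hypothetical helpers, and the assumption that clock-bound peak rates scale linearly with the SM clock is a simplification.

```shell
#!/bin/sh
# Values copied from the nvidia-smi output above (MHz).
locked_mhz=1095   # SM clock observed while profiling
boost_mhz=1410    # "Max Clocks" SM value

# Convert a clock in MHz to cycles per nanosecond (1000 MHz = 1 cycle/ns).
# Hypothetical helper, not part of any NVIDIA tool.
mhz_to_cycles_per_ns() {
    awk -v mhz="$1" 'BEGIN { printf "%.3f\n", mhz / 1000 }'
}

# Ratio of two clocks, e.g. profiled clock over boost clock.
# Hypothetical helper, not part of any NVIDIA tool.
clock_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.3f\n", a / b }'
}

mhz_to_cycles_per_ns "$locked_mhz"       # matches the ~1.09 cycles/ns in the report
clock_ratio "$locked_mhz" "$boost_mhz"   # fraction of the boost clock in use
```

Under that linear-scaling assumption, the ratio indicates how much lower a clock-bound roofline ceiling would sit at the profiled clock than at the boost clock.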

dwoods · March 16, 2022, 11:19pm

Hello,
I’m moving this post to the Nsight Compute forum since it looks like you were using Nsight Compute instead of Nsight Graphics.
Regards,

felix_dt · March 17, 2022, 6:45am

You can find information on Nsight Compute’s clock control behavior in the documentation. By default, Nsight Compute locks the GPU clocks to their base frequency, which is relatively low on some GPUs such as the A100, to provide a controlled and deterministic environment for data collection. If you require a specific frequency, you can disable clock control in Nsight Compute (the --clock-control none option) and lock the clocks externally using nvidia-smi. Note that profiling without locked clocks can result in incorrect metric values.
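
A minimal sketch of that workflow, assuming GPU index 0, a target of 1410 MHz (the Max Clocks value from the question), and a hypothetical application `./my_app`. The script only prints the commands unless `RUN=1` is set; locking clocks with nvidia-smi typically requires administrator privileges, and the GPU may still drop below the requested clock under power or thermal limits.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed. Set RUN=1 to execute.
GPU_INDEX=0        # assumption: the GPU being profiled
SM_MHZ=1410        # assumption: desired SM clock, taken from "Max Clocks" above
APP=./my_app       # hypothetical application path

# Print the command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Lock the GPU clock to a fixed value (min and max set to the same frequency).
lock_clocks() {
    run nvidia-smi -i "$1" --lock-gpu-clocks="$2,$2"
}

# Profile with Nsight Compute's own clock control disabled.
profile_unlocked() {
    run ncu --clock-control none "$@"
}

# Restore default clock behavior after profiling.
reset_clocks() {
    run nvidia-smi -i "$1" --reset-gpu-clocks
}

lock_clocks "$GPU_INDEX" "$SM_MHZ"
profile_unlocked -o report "$APP"
reset_clocks "$GPU_INDEX"
```

Re-running `nvidia-smi -q -d CLOCK -i 0` during the profiling run is one way to check which clock actually took effect.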

Gaetan · March 17, 2022, 8:18pm

Hi Felix

Thanks for your reply. That makes perfect sense.
Regards,
Gaetan

