Nsight Compute Clock Speed During Profiling

I’ve been profiling a kernel on an A100 GPU. Nsight Compute reports an SM Frequency of 1.09 cycles/ns, which appears to correspond to the nvidia-smi output below. However, when I run the same command while the application is running without NCU, it lists 1410 MHz for the Graphics and SM clocks, which is the boost clock I expect. Is there a reason why the clock is lower when profiling than in an actual run? This results in incorrect roofline profiles in Nsight Compute due to the artificially low clock speed.

Thanks
Gaetan

nvidia-smi -q -d CLOCK -i 0

==============NVSMI LOG==============

Timestamp                                 : Wed Mar 16 16:06:13 2022
Driver Version                            : 470.57.02
CUDA Version                              : 11.4

Attached GPUs                             : 8
GPU 00000000:07:00.0
    Clocks
        Graphics                          : 1095 MHz
        SM                                : 1095 MHz
        Memory                            : 1215 MHz
        Video                             : 585 MHz
    Applications Clocks
        Graphics                          : 1095 MHz
        Memory                            : 1215 MHz
    Default Applications Clocks
        Graphics                          : 1095 MHz
        Memory                            : 1215 MHz
    Max Clocks
        Graphics                          : 1410 MHz
        SM                                : 1410 MHz
        Memory                            : 1215 MHz
        Video                             : 1290 MHz
    Max Customer Boost Clocks
        Graphics                          : 1410 MHz
    SM Clock Samples
        Duration                          : Not Found
        Number of Samples                 : Not Found
        Max                               : Not Found
        Min                               : Not Found
        Avg                               : Not Found
    Memory Clock Samples
        Duration                          : Not Found
        Number of Samples                 : Not Found
        Max                               : Not Found
        Min                               : Not Found
        Avg                               : Not Found
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A

Hello,
I’m moving this post to the Nsight Compute forum since it looks like you were using Nsight Compute instead of Nsight Graphics.
Regards,

You can find information on Nsight Compute’s clock control behavior in the documentation. By default, Nsight Compute locks the clocks to their base value, which is relatively low on some GPUs such as the A100, to provide a controlled and deterministic environment for data collection. If you require a specific frequency, you can disable clock control in Nsight Compute and lock the clocks externally using nvidia-smi. Note that profiling without locked clocks can result in incorrect metric values.
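As a rough sketch of that workflow, the script below locks the GPU clocks with nvidia-smi, runs ncu with its own clock control disabled, and then resets the clocks. The GPU index 0 and the 1410 MHz target are taken from the question above; the report name `report`, the application `./my_app`, and the `run` / `profile_at_clock` helpers are placeholders made up for illustration. It assumes a driver and GPU that support `--lock-gpu-clocks`, and that you have the privileges to change clocks (usually root). Because locking clocks affects everyone on the machine, the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: profile with clocks locked externally instead of by Nsight Compute.
# GPU index, clock value, report name and ./my_app are illustrative assumptions.

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: profile_at_clock <gpu-index> <clock-mhz> <application> [args...]
profile_at_clock() {
    local gpu="$1" mhz="$2"
    shift 2
    # Lock min and max GPU clock to the same value (usually needs root).
    run sudo nvidia-smi -i "$gpu" --lock-gpu-clocks="$mhz,$mhz"
    # Disable Nsight Compute's own locking to base clocks.
    run ncu --clock-control none -o report "$@"
    # Return the GPU to its default clock behavior afterwards.
    run sudo nvidia-smi -i "$gpu" --reset-gpu-clocks
}

profile_at_clock 0 1410 ./my_app
```

The application still has to run on the GPU whose clocks were locked. It is also worth checking the SM Frequency that Nsight Compute reports afterwards, since the GPU may not sustain the requested clock under power or thermal limits.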

Hi Felix

Thanks for your reply. That makes perfect sense.
Regards,
Gaetan