RTX 2070 doesn't report running at max clock rate in the P0 state.

nvidia-smi reports a graphics clock of 1410 MHz and a max graphics clock of 2100 MHz.
Is there a way to get the higher clock rate?
Thanks, Patrick

tc@tc-Super-Server083:~/Desktop$ nvidia-smi
Fri Jan 3 13:37:54 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070     Off | 00000000:65:00.0  On |                  N/A |
|  0%   55C    P0    39W / 175W |    779MiB /  7979MiB |     29%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2844      G   /usr/lib/xorg/Xorg                            40MiB |
|    0      3157      G   /usr/bin/gnome-shell                         115MiB |
|    0     17987      G   /usr/lib/xorg/Xorg                           304MiB |
|    0     18128      G   /usr/bin/gnome-shell                         197MiB |
|    0     31232    C+G   ./bin/VulkanVideoIO                          120MiB |
+-----------------------------------------------------------------------------+

tc@tc-Super-Server083:~/Desktop$ nvidia-smi -i 0 -q -d CLOCK

==============NVSMI LOG==============

Timestamp                           : Fri Jan  3 11:51:12 2020
Driver Version                      : 430.26
CUDA Version                        : 10.2

Attached GPUs                       : 1
GPU 00000000:65:00.0
    Clocks
        Graphics                    : 1410 MHz
        SM                          : 1410 MHz
        Memory                      : 7000 MHz
        Video                       : 1305 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 2100 MHz
        SM                          : 2100 MHz
        Memory                      : 7001 MHz
        Video                       : 1950 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    SM Clock Samples
        Duration                    : Not Found
        Number of Samples           : Not Found
        Max                         : Not Found
        Min                         : Not Found
        Avg                         : Not Found
    Memory Clock Samples
        Duration                    : Not Found
        Number of Samples           : Not Found
        Max                         : Not Found
        Min                         : Not Found
        Avg                         : Not Found
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A

GPU load is only "29%", so the GPU is running at the "P0" base core clock of 1410 MHz (see https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units). Run some GPU load (for example https://benchmark.unigine.com/) to trigger a clock boost. You will still be limited by power throttling (175 W) or temperature throttling (80 C), so you can expect a clock somewhere in the 1620-2100 MHz range.
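
If you want to watch the clock from your own code instead of from nvidia-smi, a minimal sketch using NVML (the library nvidia-smi itself is built on) can read the current and maximum SM clock. It assumes nvml.h and libnvidia-ml from the driver / CUDA toolkit are installed:

// clock_query.cpp - read the current and maximum SM clock through NVML (sketch).
// Build (assumption): g++ clock_query.cpp -o clock_query -lnvidia-ml
#include <cstdio>
#include <nvml.h>

int main() {
    if (nvmlInit() != NVML_SUCCESS) {
        std::fprintf(stderr, "nvmlInit failed\n");
        return 1;
    }
    nvmlDevice_t dev;
    if (nvmlDeviceGetHandleByIndex(0, &dev) == NVML_SUCCESS) {
        unsigned int cur = 0, max = 0;
        // Current SM clock, i.e. whatever the boost/throttle control loop selected.
        nvmlDeviceGetClockInfo(dev, NVML_CLOCK_SM, &cur);
        // Hardware maximum SM clock (the 2100 MHz shown under "Max Clocks").
        nvmlDeviceGetMaxClockInfo(dev, NVML_CLOCK_SM, &max);
        std::printf("SM clock: %u MHz (max %u MHz)\n", cur, max);
    }
    nvmlShutdown();
    return 0;
}

Run it while the benchmark is active and the SM clock should climb above the 1410 MHz base clock until power or temperature throttling kicks in.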

Thanks for the feedback.
I have a compute shader that runs for 536 us in a latency-sensitive application (at a clock of 1400 MHz). I would like to temporarily increase the frequency to the boost frequency (2100 MHz?) to reduce the latency further. Currently the power consumption is 35 W and the core temperature is 55 C, so there is still margin here.
The application is written in C++ and uses the Vulkan framework.

# nvidia-smi -q
...
    Product Name                    : GeForce RTX 2070 SUPER
    Performance State               : P0
    Power Readings
        Power Draw                  : 59.68 W
    Clocks
        Graphics                    : 1605 MHz
        SM                          : 1605 MHz
        Memory                      : 7000 MHz
        Video                       : 1485 MHz
...
#  nvidia-smi -lgc 1900,1900
Gpu clocks set to "(gpuClkMin 1900, gpuClkMax 1900)" for GPU 00000000:01:00.0
All done.
# nvidia-smi -q
...
    Performance State               : P0
    Power Readings
        Power Draw                  : 82.83 W
    Clocks
        Graphics                    : 1905 MHz
        SM                          : 1905 MHz
        Memory                      : 7000 MHz
        Video                       : 1770 MHz
...
# nvidia-smi -rgc
All done.
# nvidia-smi -q
    Performance State               : P0
    Power Readings
        Power Draw                  : 59.68 W
    Clocks
        Graphics                    : 1605 MHz
        SM                          : 1605 MHz
        Memory                      : 7000 MHz
        Video                       : 1485 MHz

PS: You can also fix the "P0" state permanently in the kernel driver - https://gridforums.nvidia.com/default/topic/378/#14573.

Thanks, it is now running faster.
Original: 1410 MHz, compute shader takes 536 us, power draw 38.33 W, temperature 55 C, fan speed 0%
New settings: 1860 MHz, compute shader takes 408 us, power draw 54.96 W, temperature 53 C, fan speed 47%

nvidia-smi reports a maximum clock of 2100 MHz, while the manufacturer's (Gigabyte) website lists 1620 MHz as the boost clock. It is now running at 1860 MHz. Still a bit confusing.
It now consumes an extra 142 kWh per year, which would cost me roughly 14 euros per year. As I need this fast clock only 2.5% of the time, it would be interesting to change the clocks dynamically. I have no idea how to trigger this from the Vulkan framework itself; maybe something like the NVML sketch below could work.
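
A sketch of the idea (untested; it assumes NVML can be called from the C++ process and that the process has the root privileges that locking clocks normally requires):

// lock_clocks.cpp - temporarily lock the GPU clock around a latency-critical
// dispatch through NVML (sketch). nvmlDeviceSetGpuLockedClocks is the
// programmatic counterpart of "nvidia-smi -lgc" and normally needs root.
// Build (assumption): g++ lock_clocks.cpp -o lock_clocks -lnvidia-ml
#include <cstdio>
#include <nvml.h>

// Placeholder for the real work: submit the Vulkan compute dispatch and wait.
static void runComputeShader() {}

int main() {
    nvmlDevice_t dev;
    if (nvmlInit() != NVML_SUCCESS ||
        nvmlDeviceGetHandleByIndex(0, &dev) != NVML_SUCCESS) {
        std::fprintf(stderr, "NVML not available\n");
        return 1;
    }

    // Lock the graphics clock just before the dispatch
    // (same effect as "nvidia-smi -lgc 1860,1860").
    nvmlReturn_t r = nvmlDeviceSetGpuLockedClocks(dev, 1860, 1860);
    if (r != NVML_SUCCESS)
        std::fprintf(stderr, "SetGpuLockedClocks: %s\n", nvmlErrorString(r));

    runComputeShader();

    // Give the clock back to the automatic boost logic (same as "nvidia-smi -rgc").
    nvmlDeviceResetGpuLockedClocks(dev);
    nvmlShutdown();
    return 0;
}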

details:
sudo nvidia-smi -q
==============NVSMI LOG==============

Timestamp                           : Tue Jan  7 07:47:10 2020
Driver Version                      : 430.26
CUDA Version                        : 10.2

Attached GPUs                       : 1
GPU 00000000:65:00.0
    Product Name                    : GeForce RTX 2070

    Temperature
        GPU Current Temp            : 55 C
        GPU Shutdown Temp           : 94 C
        GPU Slowdown Temp           : 91 C
        GPU Max Operating Temp      : 89 C
        Memory Current Temp         : N/A
        Memory Max Operating Temp   : N/A
    Power Readings
        Power Management            : Supported
        Power Draw                  : 38.33 W
        Power Limit                 : 175.00 W
        Default Power Limit         : 175.00 W
        Enforced Power Limit        : 175.00 W
        Min Power Limit             : 125.00 W
        Max Power Limit             : 200.00 W
    Clocks
        Graphics                    : 1410 MHz
        SM                          : 1410 MHz
        Memory                      : 7000 MHz
        Video                       : 1305 MHz

sudo nvidia-smi -lgc 2100,2100
sudo nvidia-smi -q
    Temperature
        GPU Current Temp            : 53 C
        GPU Shutdown Temp           : 94 C
        GPU Slowdown Temp           : 91 C
        GPU Max Operating Temp      : 89 C
        Memory Current Temp         : N/A
        Memory Max Operating Temp   : N/A
    Power Readings
        Power Management            : Supported
        Power Draw                  : 54.96 W
        Power Limit                 : 175.00 W
        Default Power Limit         : 175.00 W
        Enforced Power Limit        : 175.00 W
        Min Power Limit             : 125.00 W
        Max Power Limit             : 200.00 W
    Clocks
        Graphics                    : 1860 MHz
        SM                          : 1860 MHz
        Memory                      : 7000 MHz
        Video                       : 1725 MHz
sudo lspci -v -s 65:0
[sudo] password for tc:
65:00.0 VGA compatible controller: NVIDIA Corporation Device 1f02 (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Gigabyte Technology Co., Ltd Device 37c2
        Flags: bus master, fast devsel, latency 0, IRQ 277, NUMA node 0
        Memory at df000000 (32-bit, non-prefetchable)
        Memory at c0000000 (64-bit, prefetchable)
        Memory at d0000000 (64-bit, prefetchable)
        I/O ports at b000
        [virtual] Expansion ROM at 000c0000 [disabled]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [250] Latency Tolerance Reporting
        Capabilities: [258] L1 PM Substates
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900] #19
        Capabilities: [bb0] #15
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

You can also increase the "Enforced Power Limit" (with the "-pl 200" option). Boost clocking is automatic, so if your compute shader ran for longer, the load-clock control loop could raise (boost) the clock for your program automatically (as long as it is not locked by the "-lgc 2100,2100" option). But it seems that your program is too short for the control loop to react properly. There is only limited public knowledge about the load-clock algorithm: https://lmgtfy.com/?q=nvidia+boost+4.0.
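
If you prefer to raise the power limit from the application instead of calling nvidia-smi, a rough NVML sketch (the same idea as "-pl 200"; it assumes the process runs as root, since changing the power limit is a privileged operation) could look like this:

// power_limit.cpp - raise the enforced power limit through NVML, roughly the
// programmatic counterpart of "nvidia-smi -pl 200" (sketch, needs root).
// Build (assumption): g++ power_limit.cpp -o power_limit -lnvidia-ml
#include <cstdio>
#include <nvml.h>

int main() {
    nvmlDevice_t dev;
    if (nvmlInit() != NVML_SUCCESS ||
        nvmlDeviceGetHandleByIndex(0, &dev) != NVML_SUCCESS) {
        std::fprintf(stderr, "NVML not available\n");
        return 1;
    }

    // Allowed range in milliwatts; on this card nvidia-smi reports 125 W - 200 W.
    unsigned int minMw = 0, maxMw = 0;
    nvmlDeviceGetPowerManagementLimitConstraints(dev, &minMw, &maxMw);

    // NVML expects milliwatts; clamp the 200 W request to the reported maximum.
    unsigned int requestMw = (maxMw != 0 && maxMw < 200000) ? maxMw : 200000;
    nvmlReturn_t r = nvmlDeviceSetPowerManagementLimit(dev, requestMw);
    std::printf("Set power limit to %u mW: %s\n", requestMw, nvmlErrorString(r));

    nvmlShutdown();
    return 0;
}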