How to force performance level 3 (driver 381.22, with GTX 1080 Ti)

I’ve been pulling my hair out for a while now. I have two 1080 Ti cards. The driver is 381.22 on Ubuntu 16.04.2 (from the drivers PPA). I tried a few things in xorg.conf to no avail. While idle, I can see the card going to performance level 3 sporadically, but every time it starts doing intense GPU work, it drops to performance level 2 and stays there, never reaching performance level 3.

Frustratingly enough, I can force it to level 0 by changing the 0x1 into 0x3 for PowerMizerDefault and PowerMizerDefaultAC.

Temperature is good (72C). I’ve tried multiple cards, the Gigabyte Xtreme, MSI Aero OC, etc. I’ve set the power limit to 300W for all of them.

Is there any way to force it to performance level 3?

Here’s my xorg.conf:

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 381.22  (buildmeister@swio-display-x86-rhel47-02)  Thu May  4 01:29:00 PDT 2017

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    Screen      1  "Screen1" RightOf "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
    Option         "Xinerama" "0"
EndSection

Section "InputDevice"
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    #VendorName     "NVIDIA Corporation"
    #BoardName      "GeForce GTX 1080 Ti"
    BusID          "PCI:1:0:0"
    Option         "Coolbits" "31"
    Option         "RegistryDwords" "PowerMizerEnable=0x1; PerfLevelSrc=0x2222; PowerMizerLevel=0x1; PowerMizerDefault=0x1; PowerMizerDefaultAC=0x1"
EndSection

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    #VendorName     "NVIDIA Corporation"
    #BoardName      "GeForce GTX 1070"
    BusID          "PCI:5:0:0"
    Option         "Coolbits" "31"
    Option         "RegistryDwords" "PowerMizerEnable=0x1; PerfLevelSrc=0x2222; PowerMizerLevel=0x1; PowerMizerDefault=0x1; PowerMizerDefaultAC=0x1"
    Option         "ConnectedMonitor" "DFP-0"
    Option         "CustomEDID" "DFP-0:/etc/X11/edid.bin"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
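The "RegistryDwords" value in the Device sections above is a semicolon-separated list of key=value pairs with hexadecimal values. A minimal sketch for parsing, editing, and re-serializing such a string is shown here; the helper names are hypothetical, and it only manipulates the text. It makes no claim about what the driver does with any particular value.

```python
def parse_registry_dwords(value):
    """Split a RegistryDwords string into an ordered {key: int} mapping."""
    settings = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, raw = item.partition("=")
        settings[key.strip()] = int(raw.strip(), 16)  # accepts the 0x prefix
    return settings


def format_registry_dwords(settings):
    """Serialize the mapping back into the 'Key=0x..; Key=0x..' form."""
    return "; ".join(f"{key}={val:#x}" for key, val in settings.items())


# The string used in both Device sections of the xorg.conf above.
current = ("PowerMizerEnable=0x1; PerfLevelSrc=0x2222; PowerMizerLevel=0x1; "
           "PowerMizerDefault=0x1; PowerMizerDefaultAC=0x1")

settings = parse_registry_dwords(current)
# The change described in the post: 0x1 -> 0x3 for both defaults.
settings["PowerMizerDefault"] = 0x3
settings["PowerMizerDefaultAC"] = 0x3
edited = format_registry_dwords(settings)
```

The resulting string can be pasted back into the Option line; X has to be restarted for it to take effect.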

Please run nvidia-bug-report.sh and attach output to your post.
What does
nvidia-smi -q -d PERFORMANCE
report?

I’ll attach the log from the bug report later, once I fetch it from the remote server. I wouldn’t want to post the whole bug report output publicly. Do you have an email address?

I note that I wasn’t necessarily reporting a bug, but mostly asking how to force performance level 3 at all times. Does anyone know?

# nvidia-smi -q -d PERFORMANCE

==============NVSMI LOG==============

Timestamp                           : Tue May 30 14:02:15 2017
Driver Version                      : 381.22

Attached GPUs                       : 2
GPU 0000:01:00.0
    Performance State               : P2
    Clocks Throttle Reasons
        Idle                        : Not Active
        Applications Clocks Setting : Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
        Sync Boost                  : Not Active
        Unknown                     : Not Active

GPU 0000:05:00.0
    Performance State               : P2
    Clocks Throttle Reasons
        Idle                        : Not Active
        Applications Clocks Setting : Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
        Sync Boost                  : Not Active
        Unknown                     : Not Active

Fixed max performance should be enabled with
PowerMizerEnable=0x1; PerfLevelSrc=0x2222; PowerMizerDefault=0x3; PowerMizerDefaultAC=0x1
But you have Application Clock Setting throttling active. A full nvidia-smi -q should show you the default and current settings. You then have to reset/set them with nvidia-smi -rac/-ac to the values you want.
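To compare several GPUs or machines quickly, the throttle reasons can be pulled out of the nvidia-smi text programmatically. The following is a rough sketch, assuming the layout printed by driver 381.22 as quoted in this thread (a "GPU <bus id>" header at column 0, indented "Name : Value" fields); the function name is made up for illustration and other driver versions may format the output differently.

```python
import re

GPU_HEADER = re.compile(r"^GPU\s+([0-9A-Fa-f:.]+)\s*$")
FIELD = re.compile(r"^\s+(.+?)\s+:\s+(.*)$")


def parse_performance(text):
    """Return {bus_id: {"pstate": str, "active": [reason, ...]}}."""
    gpus = {}
    current = None
    section_indent = None  # indent of "Clocks Throttle Reasons" while inside it
    for line in text.splitlines():
        header = GPU_HEADER.match(line)
        if header:
            current = {"pstate": None, "active": []}
            gpus[header.group(1)] = current
            section_indent = None
            continue
        if current is None or not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        if line.strip() == "Clocks Throttle Reasons":
            section_indent = indent
            continue
        if section_indent is not None and indent <= section_indent:
            section_indent = None  # left the throttle-reason block
        field = FIELD.match(line)
        if not field:
            continue
        key, value = field.group(1), field.group(2).strip()
        if key == "Performance State":
            current["pstate"] = value
        elif section_indent is not None and value == "Active":
            current["active"].append(key)
    return gpus


# Shortened copy of the output posted above.
SAMPLE = "\n".join([
    "GPU 0000:01:00.0",
    "    Performance State               : P2",
    "    Clocks Throttle Reasons",
    "        Idle                        : Not Active",
    "        Applications Clocks Setting : Active",
    "        SW Power Cap                : Not Active",
])

report = parse_performance(SAMPLE)
# report == {"0000:01:00.0": {"pstate": "P2",
#                             "active": ["Applications Clocks Setting"]}}
```

On a live machine the text could come from `subprocess.run(["nvidia-smi", "-q", "-d", "PERFORMANCE"], capture_output=True, text=True).stdout` instead of the embedded sample.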

Many thanks for your answers. First, how do I set the values with -rac/-ac?

The throttling reason changes, though. Below is the output from another machine which also has two 1080 Ti cards. Only the first was in use at the time I ran nvidia-smi; note that it says P2, but the throttling reason is “Unknown”:

# nvidia-smi -q -d PERFORMANCE

==============NVSMI LOG==============

Timestamp                           : Tue May 30 21:43:48 2017
Driver Version                      : 381.22

Attached GPUs                       : 2
GPU 0000:03:00.0
    Performance State               : P2
    Clocks Throttle Reasons
        Idle                        : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
        Sync Boost                  : Not Active
        Unknown                     : Active

GPU 0000:04:00.0
    Performance State               : P8
    Clocks Throttle Reasons
        Idle                        : Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
        Sync Boost                  : Not Active
        Unknown                     : Not Active

Here’s the full output of nvidia-smi -q:

# nvidia-smi -q

==============NVSMI LOG==============

Timestamp                           : Tue May 30 21:49:43 2017
Driver Version                      : 381.22

Attached GPUs                       : 2
GPU 0000:03:00.0
    Product Name                    : GeForce GTX 1080 Ti
    Product Brand                   : GeForce
    Display Mode                    : Disabled
    Display Active                  : Disabled
    Persistence Mode                : Disabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 1920
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-f8e10def-0245-e9e0-87c2-26b6bdf65206
    Minor Number                    : 0
    VBIOS Version                   : 86.02.39.00.2A
    MultiGPU Board                  : No
    Board ID                        : 0x300
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : G001.0000.01.04
        OEM Object                  : 1.1
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization mode         : None
    PCI
        Bus                         : 0x03
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x1B0610DE
        Bus Id                      : 0000:03:00.0
        Sub System Id               : 0x36091462
        GPU Link Info
            PCIe Generation
                Max                 : 3
                Current             : 3
            Link Width
                Max                 : 16x
                Current             : 16x
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
        Replays since reset         : 0
        Tx Throughput               : 32000 KB/s
        Rx Throughput               : 91000 KB/s
    Fan Speed                       : 58 %
    Performance State               : P2
    Clocks Throttle Reasons
        Idle                        : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
        Sync Boost                  : Not Active
        Unknown                     : Active
    FB Memory Usage
        Total                       : 11172 MiB
        Used                        : 673 MiB
        Free                        : 10499 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 2 MiB
        Free                        : 254 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 99 %
        Memory                      : 76 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                Total               : N/A
        Aggregate
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                Total               : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending                     : N/A
    Temperature
        GPU Current Temp            : 74 C
        GPU Shutdown Temp           : 96 C
        GPU Slowdown Temp           : 93 C
    Power Readings
        Power Management            : Supported
        Power Draw                  : 247.54 W
        Power Limit                 : 250.00 W
        Default Power Limit         : 250.00 W
        Enforced Power Limit        : 250.00 W
        Min Power Limit             : 125.00 W
        Max Power Limit             : 300.00 W
    Clocks
        Graphics                    : 1797 MHz
        SM                          : 1797 MHz
        Memory                      : 5005 MHz
        Video                       : 1607 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 1936 MHz
        SM                          : 1936 MHz
        Memory                      : 5505 MHz
        Video                       : 1708 MHz
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A
    Processes
        Process ID                  : 5690
            Type                    : C
            Name                    : /root/zecminer/miner
            Used GPU Memory         : 663 MiB

GPU 0000:04:00.0
    Product Name                    : GeForce GTX 1080 Ti
    Product Brand                   : GeForce
    Display Mode                    : Disabled
    Display Active                  : Disabled
    Persistence Mode                : Disabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 1920
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-0e3dd1e9-e8c6-6bc6-6515-1221e0248cf4
    Minor Number                    : 1
    VBIOS Version                   : 86.02.39.00.9E
    MultiGPU Board                  : No
    Board ID                        : 0x400
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : G001.0000.01.04
        OEM Object                  : 1.1
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization mode         : None
    PCI
        Bus                         : 0x04
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x1B0610DE
        Bus Id                      : 0000:04:00.0
        Sub System Id               : 0x37511458
        GPU Link Info
            PCIe Generation
                Max                 : 3
                Current             : 1
            Link Width
                Max                 : 16x
                Current             : 16x
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
        Replays since reset         : 0
        Tx Throughput               : 0 KB/s
        Rx Throughput               : 0 KB/s
    Fan Speed                       : 0 %
    Performance State               : P8
    Clocks Throttle Reasons
        Idle                        : Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
        Sync Boost                  : Not Active
        Unknown                     : Not Active
    FB Memory Usage
        Total                       : 11172 MiB
        Used                        : 10 MiB
        Free                        : 11162 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 2 MiB
        Free                        : 254 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 0 %
        Memory                      : 0 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                Total               : N/A
        Aggregate
            Single Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                Total               : N/A
            Double Bit
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                Total               : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending                     : N/A
    Temperature
        GPU Current Temp            : 32 C
        GPU Shutdown Temp           : 96 C
        GPU Slowdown Temp           : 93 C
    Power Readings
        Power Management            : Supported
        Power Draw                  : 14.03 W
        Power Limit                 : 250.00 W
        Default Power Limit         : 250.00 W
        Enforced Power Limit        : 250.00 W
        Min Power Limit             : 125.00 W
        Max Power Limit             : 375.00 W
    Clocks
        Graphics                    : 278 MHz
        SM                          : 278 MHz
        Memory                      : 405 MHz
        Video                       : 544 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 2037 MHz
        SM                          : 2037 MHz
        Memory                      : 5616 MHz
        Video                       : 1708 MHz
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A
    Processes                       : None

-ac doesn’t work, it says “Setting applications clocks is not supported for GPU 0000:03:00.0.”

Also, I can overclock the memory speed while it is running in P2 by doing “nvidia-settings -a [gpu:0]/GPUMemoryTransferRateOffset[3]=1000” … this raises the clock of both P2 and P3 by 1000MHz, bringing P2 to the same memory clock that P3 used to have by default. However, when my GPU application stops, the card then crashes and it also crashes the machine, because it briefly goes to P3 which has been overclocked by 1000MHz (which is too much).

Ok, forgot that -ac only works for Tesla/Quadro GPUs.
So this might be either a driver bug (tried earlier driver versions?) or depending on the workload (tried some simple CUDA tests?).

Only 375 works with 1080 Ti as far as I know (and it lists it as a generic card). I tried it only briefly and I seem to recall it did the same thing, though I wasn’t paying attention to this particular issue.

What would be a “simple CUDA test” that I could run for you to draw some conclusions? I presume you mean tests from the official CUDA samples? The workload I was testing was mining cryptocurrency (fairly intense, but not insane)

Any update on this issue? I use 1070 and 1060 and have the same issue. My cards always stay in P2 state and I can’t set higher clocks with nvidia-smi -ac option. I use PCIe x1 to x16 extender cables to 6 GPUs in one motherboard, could this cause the issue?

sudo nvidia-smi -q -d SUPPORTED_CLOCKS:

==============NVSMI LOG==============

Timestamp                           : Wed Jun  7 16:20:43 2017
Driver Version                      : 381.22

Attached GPUs                       : 6
GPU 0000:01:00.0
    Supported Clocks                : N/A

GPU 0000:02:00.0
    Supported Clocks                : N/A

GPU 0000:03:00.0
    Supported Clocks                : N/A

GPU 0000:04:00.0
    Supported Clocks                : N/A

GPU 0000:05:00.0
    Supported Clocks                : N/A

GPU 0000:06:00.0
    Supported Clocks                : N/A

sudo nvidia-smi -ac 4000,1900
Setting applications clocks is not supported for GPU

My driver version is 381.22 on Ubuntu 16.04.2 LTS. I don’t use an X-server, because it is a headless machine.

>>every time the card starts doing some intense GPU work, it goes to performance level 2 and stays there and never goes to performance level 3.

Maybe that app didn’t need performance level 3. Why do you think it’s stuck? Try changing the setting via nvidia-settings.

@sandpit, the card behaves the same for all GPU apps (3D benchmarks, crypto miners, etc). Whether the app needs P3 or not is beside the point. I want to force the card into P3 at all times regardless of what it’s doing, and I don’t see why I should be prevented from doing that.

As I explained, the card does go to P3 sporadically when it’s idle, so it definitely can reach P3, but when the app starts, it goes to P2 and never changes state for as long as the app is running.

Changing via nvidia-settings does not affect the result. It allows me to set “performance mode” but the card remains in P2.

Read this:


Though you won’t like it.

I have also tested my cards with Windows 10, and they stay in the P2 state there too, but I can overclock the memory frequency and lower the voltage with 3rd party tools like MSI-Afterburner. Overclocking is not possible with Ubuntu. :(

@genrix, well, that sucks. I had a feeling that it’s a limitation of the 1080. I suspect hardware and not drivers, because it’s the same in Windows … I can overclock P2 to the same clocks as P3, but then as soon as my 3D app closes, the card crashes and takes down the OS with it (this is because you can’t overclock just P2; it automatically increases the clocks for the other states too, which then go above their limits, and the card chokes when switching state).

You’d think Nvidia had done a proper job under Linux after all the criticism … still such a long way to go. Right now I have to have X installed and also a running X and display or else nvidia-settings doesn’t run (needs Mir), i.e. I can’t overclock the card. So annoying.

@bans3i, overclocking works fine in Ubuntu. You need to enable coolbits and have a monitor connected (or trick it into thinking there is a monitor connected via xorg.conf). Try a google.

The only workaround I can think of is to set the fixed frequency to 0x2 (should be P2?) and then overclock. And like bans3i said, and as is mentioned in the link, it has nothing to do with Linux; it’s the same in Windows. Looks like it’s by design. Maybe because the consumer cards are not sold for CUDA workloads ;)

Thank you. I had to fake a monitor, now it works. :)

If that were true, there would be no reason to ever buy an Nvidia card again rather than use AMD with OpenCL. Why would you buy a crippled product? That would be like buying a Mustang, but you can only use the first 4 gears if you are not on a highway.

I said that, but if you read closely you’d see that the 1070 does not have this problem. Only 1080 and 1080Ti. That’s the bummer and the disappointment.

I am also having this problem. The application I am using runs in P2, and I am able to offset both the clock rate and the memory transfer rate using nvidia-smi. This offsets the clock rates for P2 and P3 by the same amount. In order to optimise my application’s performance I have to offset by a significant amount, as I cannot force it to run in P3. However, when I stop the application (and the system is idle), for some reason it seems to periodically cycle between P0-P3. When it hits P3, the offset rates are too high and the system crashes.

Interestingly, it only seems to be the graphics card attached to my monitor that cycles through P0-P3 when idle. The other two cards do not appear to cycle and remain in P0.

I either need a way to only apply the offset to P2 or I need a way to force the application to run in P3. Please advise.

Many thanks,

Sam

Actually, I discovered that the same problem exists in Windows, and for both the GTX 1080 Ti and the GTX 1070 (I tested 15 different cards) … it’s either that the new Pascal architecture no longer allows forcing the high performance state, or it’s a driver bug across all OSs. The crypto-mining community discovered this a while ago.

Forcing the P state was working just fine in the GTX 9xx series. Why did Nvidia change this?

Essentially, Nvidia currently forces us to run the cards in a semi-crippled state. You can try to overclock the P2 clock offsets to give the same clocks as P3 has (which are usually about 1000 MHz higher for the 1080 Ti), but you must do it while in P2, and it’s then almost guaranteed to crash once it exits P2. That is because you can’t change only the P2 clock offsets; when you change them, the P3 ones change as well by the same amount, going beyond the limit of P3. So when the card exits P2 (e.g. every time your 3D application finishes or has a lower load), it will at some point get to P3 and then crash.
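The arithmetic behind that crash can be sketched in a few lines. This is only a model of the behaviour described in this thread (one offset shifting every level by the same amount), not a description of what the driver does internally. The base memory clocks are the ones nvidia-smi printed earlier (5005 MHz under load in P2, 5505 MHz under "Max Clocks"); treating the "Max Clocks" figure as the top level’s clock, and assuming the card is stable only up to that stock value, are both assumptions. The numbers are in nvidia-smi’s units, which need not match the units of the nvidia-settings offset attribute.

```python
def apply_offset(base_clocks_mhz, offset_mhz):
    """Model from the thread: one offset is added to every level."""
    return {level: clock + offset_mhz
            for level, clock in base_clocks_mhz.items()}


def levels_over_limit(clocks_mhz, stable_limit_mhz):
    """Levels whose clock exceeds an assumed stability limit."""
    return sorted(level for level, clock in clocks_mhz.items()
                  if clock > stable_limit_mhz)


# Memory clocks as reported by nvidia-smi earlier in the thread.
base = {"P2": 5005, "P3": 5505}

# Offset needed to lift P2 to the stock top clock.
offset = base["P3"] - base["P2"]          # 500

shifted = apply_offset(base, offset)      # {"P2": 5505, "P3": 6005}

# Assumption: the card is only stable up to its stock top clock.
unsafe = levels_over_limit(shifted, stable_limit_mhz=base["P3"])  # ["P3"]
```

P2 lands exactly on the stock top clock, but the top level is pushed 500 MHz past it, which matches the reports of crashes as soon as the card leaves P2.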

I was never able to control that behavior. Neither under Linux, nor under Windows. I would also be happy if I could tell it to never go to P3, and then keep P2 overclocked. But that’s not possible either.

It seems the 10xx cards are always in adaptive mode, regardless of whether you specify “performance mode” in nvidia-settings. The card cycles through P states according to some unknown algorithm, e.g. it goes to P3 sporadically when semi-idle, but never when fully loaded.
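One way to document that cycling is to log the state periodically and list the transitions. The sketch below assumes samples collected with `nvidia-smi --query-gpu=index,pstate,clocks.mem --format=csv,noheader,nounits -l 1`, which prints one "index, pstate, clock" line per GPU per second; the helper name and the sample lines are made up for illustration and are not measurements from this thread.

```python
def pstate_transitions(lines):
    """Return (gpu_index, old_state, new_state) for every P-state change."""
    last = {}      # most recent state seen per GPU index
    changes = []
    for line in lines:
        parts = [part.strip() for part in line.split(",")]
        if len(parts) < 2 or not parts[0]:
            continue  # skip blank or malformed lines
        gpu, state = parts[0], parts[1]
        if gpu in last and last[gpu] != state:
            changes.append((gpu, last[gpu], state))
        last[gpu] = state
    return changes


# Hypothetical samples: GPU 0 goes idle -> loaded -> idle, GPU 1 stays idle.
samples = [
    "0, P8, 405",
    "1, P8, 405",
    "0, P2, 5005",
    "1, P8, 405",
    "0, P2, 5005",
    "1, P8, 405",
    "0, P8, 405",
    "1, P8, 405",
]

changes = pstate_transitions(samples)
# changes == [("0", "P8", "P2"), ("0", "P2", "P8")]
```

Feeding it a log captured across an application start and stop would show exactly when, and on which card, the state changes happen.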

How can we get an official response to this from Nvidia, to see if it’s a bug or intended behavior, or if there is any way to achieve what we want?

p.s. The nomenclature is also confusing, because most refer to the performance state as P0, whereas at least in Linux that’s labeled P3.