nvmlDeviceGetMinMaxClockOfPState/nvmlDeviceSetClockOffsets issues

NVML v555 introduced nvmlDeviceSetClockOffsets and nvmlDeviceGetMinMaxClockOfPState, supposedly rendering old functions for managing clock offsets obsolete. However, I did not have much luck using them:

  • nvmlDeviceSetClockOffsets sets offsets for every pstate. For example, if I set a 500 MHz offset for pstate 0 and then a 200 MHz offset for some other pstate, pstate 0 ends up following the most recent 200 MHz offset, even though it was not meant for it.
  • nvmlDeviceGetMinMaxClockOfPState always reports the current clock offset as 0 (for all pstates), even if it was previously set by nvmlDeviceSetClockOffsets, which is a pretty major issue for an API supposed to replace old functionality. Min/max offsets are returned as expected.
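For anyone who wants to reproduce both observations, here is a minimal sketch of the check I mean. The two callables are hypothetical placeholders (not NVML functions): set_offset(pstate, mhz) would wrap nvmlDeviceSetClockOffsets for one clock domain, and get_offset(pstate) would wrap whatever call reports the current offset on your driver. The helper name is made up too.

```python
from typing import Callable, Dict, List, Tuple


def find_offset_mismatches(
    set_offset: Callable[[int, int], None],
    get_offset: Callable[[int], int],
    requested: Dict[int, int],
) -> List[Tuple[int, int, int]]:
    """Apply every requested offset, then read each pstate back.

    `requested` maps pstate -> offset in MHz. Returns a list of
    (pstate, requested_mhz, reported_mhz) for each pstate whose reported
    offset differs from what was requested for it.
    """
    # Write all offsets first, so a later write gets the chance to
    # clobber an earlier one (the behaviour described above).
    for pstate, offset_mhz in requested.items():
        set_offset(pstate, offset_mhz)

    mismatches = []
    for pstate, offset_mhz in requested.items():
        reported = get_offset(pstate)
        if reported != offset_mhz:
            mismatches.append((pstate, offset_mhz, reported))
    return mismatches
```

With real wrappers plugged in, an empty result would mean the offsets are truly per-pstate and reported correctly. A non-empty result shows which pstates were overwritten or read back as something else.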

Tested on Linux with a GTX 1650 Super and an RTX 4060 Laptop, running the open-source driver 565.77.

Are per-pstate clock offsets reserved for enterprise-grade hardware or is this just a bug? If it’s the former, can we please at least get a working nvmlDeviceGetMinMaxClockOfPState (with offset being reported for all pstates, I guess) and a way to query if offsets are per-pstate or not?

P.S.: not very hopeful for a reply given the general inactivity on this forum, but worth a shot nonetheless.
P.P.S: Happy new year!

I’m having quite a few issues with the new NVML clock offset functions as well. Here is the very basic Python script I am using:

#!/usr/bin/env python

import argparse
from pynvml import *

# Initialize NVML
nvmlInit()

# Setup command line argument parsing
parser = argparse.ArgumentParser(description='Set GPU clock offsets and power limit.')
parser.add_argument('max_clock', type=int, help='Maximum GPU clock in MHz')
parser.add_argument('core_offset', type=int, help='Core clock offset in MHz')
parser.add_argument('memory_offset', type=int, help='Memory clock offset in MHz')
parser.add_argument('power_limit', type=int, help='Power limit in watts')

# Parse the arguments
args = parser.parse_args()

# GPU handle
handle = nvmlDeviceGetHandleByIndex(0)

# Clock types
CLOCK_TYPES = {
    'core': NVML_CLOCK_GRAPHICS,
    'memory': NVML_CLOCK_MEM,
}

# Function to set clock offset
def set_offsets(handle, clock_type, clock_offset):
    struct = c_nvmlClockOffset_t()
    struct.version = nvmlClockOffset_v1
    struct.type = clock_type
    struct.pstate = 0  # only affect pstate 0
    struct.clockOffsetMHz = clock_offset
    nvmlDeviceSetClockOffsets(handle, struct)

# Lock the GPU clock range
nvmlDeviceSetGpuLockedClocks(handle, 0, args.max_clock)

# Set the core and memory offsets
set_offsets(handle, CLOCK_TYPES['core'], args.core_offset)  # Core offset
set_offsets(handle, CLOCK_TYPES['memory'], args.memory_offset * 2)  # Memory offset

# Set power management limit
nvmlDeviceSetPowerManagementLimit(handle, args.power_limit * 1000)  # Convert to milliwatts

# Shutdown NVML
nvmlShutdown()

It is able to set the power limit and the max core clock to whatever I pass as max_clock, but it completely fails to set the pstate 0 clock offsets for both core and memory.
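One thing worth ruling out while debugging: the script above never reaches nvmlShutdown() if any call raises. Below is a small sketch of a wrapper that guarantees the shutdown. The name nvml_session is made up for illustration, and the init/shutdown functions are passed in as arguments so the snippet does not depend on a particular binding version. This does not fix the offset problem itself, it only keeps the failure path clean.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def nvml_session(init: Callable[[], None],
                 shutdown: Callable[[], None]) -> Iterator[None]:
    """Run `init`, hand control to the with-block, always run `shutdown`."""
    init()
    try:
        yield
    finally:
        # Runs even if a call inside the with-block raises.
        shutdown()


# Intended usage with the script above (assumes pynvml is imported):
#
#     with nvml_session(nvmlInit, nvmlShutdown):
#         handle = nvmlDeviceGetHandleByIndex(0)
#         set_offsets(handle, CLOCK_TYPES['core'], args.core_offset)
```

Any exception raised inside the block still propagates, so an NVML error shows up in the traceback instead of being hidden.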

I think NVIDIA has broken something in the recent releases.

The NVIDIA 570 series fixed the nvmlDeviceGetMinMaxClockOfPState issue (probably related to nvidia-settings switching to NVML). So that’s one issue down. nvmlDeviceSetClockOffsets still affects all pstates, though.

Because there is no way to overclock any performance state but 0*. The API is a lie.
