[Help Needed] How to Enable Grace CPU Power Telemetry on DGX Spark (GB10)?

Hi everyone,
I’m working with a DGX Spark (Grace GB10) system and I’m trying to access Grace CPU power telemetry, but so far I haven’t been able to find any of the interfaces described in the official Grace Performance Tuning Guide.

Below is a summary of what I’ve verified.
1. No ACPI Power Meter / hwmon power sensors

The guide mentions interfaces such as:

  • power1_input

  • power1_average

  • power1_oem_info

However, on DGX Spark, the hwmon device only exposes thermal‑related fields:

available_policies
emul_temp
integral_cutoff
k_d
k_i
k_po
k_pu
mode
offset
policy
power
slope
sustainable_power
temp
trip_point_0_*

There are no power meter attributes under any hwmon device.
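For anyone who wants to repeat this check, here is a minimal sketch that walks the hwmon class directory and lists any power attributes it finds. It assumes the standard Linux hwmon sysfs layout (`/sys/class/hwmon/hwmonN/` with a `name` file and `power*_input`-style attributes, reported in microwatts); the function name `find_power_attrs` is my own and not part of any NVIDIA tool.

```python
from pathlib import Path


def find_power_attrs(hwmon_root="/sys/class/hwmon"):
    """Map each hwmon device to the power attribute files it exposes.

    Returns {"hwmonN (name)": ["power1_input", ...]}; an empty list means
    the device exposes no power attributes.
    """
    found = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return found
    for dev in sorted(root.glob("hwmon*")):
        name_file = dev / "name"
        name = name_file.read_text().strip() if name_file.is_file() else dev.name
        # "power*_*" matches power1_input, power1_average, ... but not the
        # runtime-PM "power" directory that every sysfs device has.
        attrs = sorted(p.name for p in dev.glob("power*_*") if p.is_file())
        found[f"{dev.name} ({name})"] = attrs
    return found


if __name__ == "__main__":
    for dev, attrs in find_power_attrs().items():
        print(dev, "->", attrs or "no power attributes")
```

On a system with a working power meter, the second column would list entries such as `power1_input`; on the Spark described here, every device would be expected to print "no power attributes".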

2. No SCMI Power Domain

The directory:

/sys/class/scmi_power/

does not exist or is empty, suggesting SCMI power telemetry is disabled on Spark.

3. ACPI Power Meter appears disabled

The documentation states that Grace CPU power telemetry requires:

  • CONFIG_SENSORS_ACPI_POWER

  • Kernel parameter: acpi_power_meter.force_cap_on=y

But DGX Spark’s kernel does not seem to expose any ACPI power meter device.
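A rough way to check the kernel side is sketched below. It assumes the running kernel's config is available at `/boot/config-<release>` or `/proc/config.gz` (common on Ubuntu-based systems, but not guaranteed on every image) and that a loaded or built-in driver shows up under `/sys/module/acpi_power_meter`. The helper names `kernel_option` and `load_kernel_config` are hypothetical.

```python
import gzip
import platform
from pathlib import Path


def kernel_option(config_text, option):
    """Return the value of a CONFIG_* option ('y', 'm', ...) or None if unset."""
    prefix = option + "="
    for line in config_text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    return None


def load_kernel_config():
    """Read the running kernel's config text, or return '' if it is not found."""
    try:
        boot = Path(f"/boot/config-{platform.release()}")
        if boot.is_file():
            return boot.read_text()
        proc = Path("/proc/config.gz")
        if proc.is_file():
            return gzip.decompress(proc.read_bytes()).decode()
    except OSError:
        pass
    return ""


if __name__ == "__main__":
    value = kernel_option(load_kernel_config(), "CONFIG_SENSORS_ACPI_POWER")
    print("CONFIG_SENSORS_ACPI_POWER =", value or "not set / config not found")
    # Assumption: the driver is visible here when loaded or built in.
    print("driver present:", Path("/sys/module/acpi_power_meter").is_dir())
```

Even if the option is enabled, the driver only binds when the firmware describes an ACPI power meter device, so a "y" or "m" here does not by itself mean telemetry is available.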

4. DCGM / NVML only report GPU power

nvidia-smi and DCGM provide GPU power draw, but no Grace CPU power information.

5. DGX Spark has no BMC

Since Spark lacks a BMC, Redfish/IPMI cannot be used to retrieve CPU rail power either.

My Questions

I would like to confirm the following:

1. Is Grace CPU power telemetry completely disabled on DGX Spark hardware?

2. Is there any kernel configuration, driver module, or DGX OS package that can enable it?

3. Are there hidden sysfs interfaces, NVML/DCGM extensions, or experimental features that expose CPU rail power?

4. Will future DGX OS releases provide Grace CPU power telemetry for Spark?

5. If telemetry is intentionally unavailable, is there an NVIDIA‑recommended alternative for estimating CPU power?


This might be an impressively stupid question, but does the power reading in the nvidia-smi output for the GB10 chip cover both the GPU and the CPU?

Power management on the Spark is different from that of the 72-core Grace CPU. There is no method to monitor CPU power, and there are currently no plans to expose CPU rail information.

The power information from nvidia-smi is only for the GPU.
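For reference, the GPU-only reading can be pulled programmatically with something like the sketch below. It uses the `--query-gpu=power.draw` and `--format=csv,noheader,nounits` options of nvidia-smi; the function names are illustrative, and a field reported as `[N/A]` is returned as `None` rather than guessed at.

```python
import shutil
import subprocess


def parse_power_draw(output):
    """Parse nvidia-smi power.draw output (one value per GPU, in watts)."""
    readings = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            readings.append(float(line))
        except ValueError:
            # Non-numeric fields such as "[N/A]" are kept as None.
            readings.append(None)
    return readings


def gpu_power_watts():
    """Return GPU power draw per GPU, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_power_draw(result.stdout)


if __name__ == "__main__":
    print("GPU power draw (W):", gpu_power_watts())
```

Whatever this returns covers the GPU only, so it cannot be used to infer CPU rail power.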


Hi,

I vibe coded a kernel driver for this.

It turns out there is a MediaTek chipset on the Spark that collects this information.

I only just made this, so I am still testing it. Give it a go.


Your answer (marked as Solution) was followed by replies from other users, including code for a functional driver and tests by another user. Those messages seem to contradict your answer, which is confusing, at least for me.

@aniculescu, could you please clarify what you meant by “There is no method to monitor CPU power”?

The community to the rescue, as usual :)

Thank you @antheas !

This is awesome and gives me hope of us eventually figuring out how to access voltage / clock / fan controls on these boxes.


Hi,
Clock control for the GPU is already available.

E.g., sudo nvidia-smi -lgc 208,2418
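If you script this, a small wrapper along these lines keeps the lock and the matching reset together. It is only a sketch: it builds the command lines without running them, the helper names are made up, and the 208/2418 MHz values are just the ones from the example above. `-lgc` takes `min,max` in MHz and `-rgc` resets the locked clocks.

```python
import shlex


def lock_gpu_clocks_cmd(min_mhz, max_mhz):
    """Build the nvidia-smi command that locks GPU clocks to [min, max] MHz."""
    if not (0 < min_mhz <= max_mhz):
        raise ValueError("expected 0 < min_mhz <= max_mhz")
    return ["sudo", "nvidia-smi", "-lgc", f"{min_mhz},{max_mhz}"]


def reset_gpu_clocks_cmd():
    """Build the nvidia-smi command that removes a previous clock lock."""
    return ["sudo", "nvidia-smi", "-rgc"]


if __name__ == "__main__":
    # Printed rather than executed, so nothing changes on the machine.
    print(shlex.join(lock_gpu_clocks_cmd(208, 2418)))
    print(shlex.join(reset_gpu_clocks_cmd()))
```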

I updated the kernel driver to show fan speeds, cleaned up the wattages to only ones that make sense, and added pl1/pl2/syspl1/syspl2 controls to allow controlling the thermal envelope.

AFAIK the interface does not expose fan controls. It does expose controls to limit the thermal envelope of the CPU E and P cores individually, but they seem to be non-functional for now. I’m not sure about fans.

In addition, the _DSM value of 12345678-1234-1234-1234-123456789abc suggests that this interface is still under development, so MediaTek will need to polish it first.

But it did save my bacon, and I am back on track for my deadline. I think the ABI exposed by the kernel driver has reached the point where it could be upstreamed to hwmon as is, since it uses the same conventions as other drivers. The code itself will, of course, need rewriting.
