Nvidia-powerd fails to start on 4060

Hello, I recently purchased an ASUS GA403UV with an RTX 4060. Under Linux, however, nvidia-powerd always fails as soon as I start it, with the following error message:

Apr 05 10:46:34 Zephyrus /usr/bin/nvidia-powerd[605]: Error in getting the NvPCF static configuration
Apr 05 10:46:34 Zephyrus /usr/bin/nvidia-powerd[605]: Failed to initialize Dynamic Boost
Apr 05 10:46:34 Zephyrus /usr/bin/nvidia-powerd[605]: Failed to detach GPU id 256
Apr 05 10:46:34 Zephyrus /usr/bin/nvidia-powerd[605]: Failed to initialize Dynamic Boost
Apr 05 10:46:34 Zephyrus /usr/bin/nvidia-powerd[605]: Failed to detach GPU id 256

I’ve tested this under different setups (stable kernel, rc kernel, proprietary driver, nvidia-open driver, different distributions), and the result is always the same.

nvidia-bug-report.log.gz (1.6 MB)

The error message points more toward a BIOS issue.

Thank you for your help! But I’ve contacted ASUS customer service, and they’re unwilling to provide Linux support. What should I do?

Did you check if this works on Windows?

It does work on Windows. Under Windows, the GPU’s power consumption goes up to 90 W, which is the machine’s maximum budget for the GPU, while under Linux I only get 55 W at most.

Then you won’t get any help from ASUS. Does /sys/firmware/acpi/platform_profile exist on your notebook?

Yes

> $ ls /sys/firmware/acpi/
bgrt  hotplug     platform_profile          pm_profile
fpdt  interrupts  platform_profile_choices  tables

The BIOS of this laptop is full of bugs. Even the CPU’s CPPC feature won’t work under Linux because of ASUS’s silly BIOS.

Please check
cat /sys/firmware/acpi/platform_profile_choices
for the available modes; “performance” should be one of them. Install power-profiles-daemon and use it to switch to performance mode. Check nvidia-smi -q -d POWER for an increased power limit, then check whether nvidia-powerd can be started.
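For anyone who wants to script the sysfs part of this check, here is a minimal Python sketch. It assumes the kernel exposes the two platform_profile files under /sys/firmware/acpi as a space-separated list and a single word; the function names `parse_choices` and `read_profile_state` are made up for illustration. It only reads the files and does not switch profiles.

```python
from pathlib import Path

# Directory named in the thread; the two file names below are the ones
# listed in the `ls /sys/firmware/acpi/` output.
ACPI_DIR = Path("/sys/firmware/acpi")


def parse_choices(text: str) -> list[str]:
    """Split the space-separated contents of platform_profile_choices."""
    return text.split()


def read_profile_state(acpi_dir: Path = ACPI_DIR):
    """Return the current profile and the available choices, or None
    if the platform_profile interface is not exposed."""
    choices_file = acpi_dir / "platform_profile_choices"
    current_file = acpi_dir / "platform_profile"
    if not (choices_file.is_file() and current_file.is_file()):
        return None
    choices = parse_choices(choices_file.read_text())
    return {
        "current": current_file.read_text().strip(),
        "choices": choices,
        "performance_available": "performance" in choices,
    }


if __name__ == "__main__":
    state = read_profile_state()
    if state is None:
        print("platform_profile is not exposed on this machine")
    else:
        print(state)
```

If `performance_available` is true but `current` is something else, switch the profile with power-profiles-daemon first and rerun the check.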

It still doesn’t work.

> $ cat /sys/firmware/acpi/platform_profile
performance

> $ nvidia-smi -q -d POWER

==============NVSMI LOG==============

Timestamp                                 : Tue Apr 23 10:19:35 2024
Driver Version                            : 550.76
CUDA Version                              : 12.4

Attached GPUs                             : 1
GPU 00000000:01:00.0
    GPU Power Readings
        Power Draw                        : 10.48 W
        Current Power Limit               : 55.00 W
        Requested Power Limit             : 55.00 W
        Default Power Limit               : 55.00 W
        Min Power Limit                   : 5.00 W
        Max Power Limit                   : 90.00 W
    Power Samples
        Duration                          : Not Found
        Number of Samples                 : Not Found
        Max                               : Not Found
        Min                               : Not Found
        Avg                               : Not Found
    GPU Memory Power Readings 
        Power Draw                        : N/A
    Module Power Readings
        Power Draw                        : N/A
        Current Power Limit               : N/A
        Requested Power Limit             : N/A
        Default Power Limit               : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A

                                                                                
> $ sudo systemctl start nvidia-powerd
Job for nvidia-powerd.service failed because the control process exited with error code.
See "systemctl status nvidia-powerd.service" and "journalctl -xeu nvidia-powerd.service" for details.
                                                                                                                                                                                  

> $ systemctl status nvidia-powerd
× nvidia-powerd.service - nvidia-powerd service
     Loaded: loaded (/etc/systemd/system/nvidia-powerd.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Tue 2024-04-23 10:20:19 CST; 10s ago
    Process: 3791 ExecStart=/nix/store/gnc111yw4pk6w4gj42fafpcwb0d8bl7a-nvidia-x11-550.76-6.9.0-rc5-bin/bin/nvidia-powerd (code=exited, status=1/FAILURE)
   Main PID: 3791 (code=exited, status=1/FAILURE)
         IP: 0B in, 0B out
        CPU: 3ms

Apr 23 10:20:19 Zephyrus systemd[1]: Starting nvidia-powerd service...
Apr 23 10:20:19 Zephyrus /nix/store/gnc111yw4pk6w4gj42fafpcwb0d8bl7a-nvidia-x11-550.76-6.9.0-rc5-bin/bin/nvidia-powerd[3791]: nvidia-powerd version:1.0(build 1)
Apr 23 10:20:19 Zephyrus /nix/store/gnc111yw4pk6w4gj42fafpcwb0d8bl7a-nvidia-x11-550.76-6.9.0-rc5-bin/bin/nvidia-powerd[3791]: Error in getting the NvPCF static configuration
Apr 23 10:20:19 Zephyrus /nix/store/gnc111yw4pk6w4gj42fafpcwb0d8bl7a-nvidia-x11-550.76-6.9.0-rc5-bin/bin/nvidia-powerd[3791]: Failed to initialize Dynamic Boost
Apr 23 10:20:19 Zephyrus /nix/store/gnc111yw4pk6w4gj42fafpcwb0d8bl7a-nvidia-x11-550.76-6.9.0-rc5-bin/bin/nvidia-powerd[3791]: Failed to detach GPU id 256
Apr 23 10:20:19 Zephyrus /nix/store/gnc111yw4pk6w4gj42fafpcwb0d8bl7a-nvidia-x11-550.76-6.9.0-rc5-bin/bin/nvidia-powerd[3791]: Failed to initialize Dynamic Boost
Apr 23 10:20:19 Zephyrus /nix/store/gnc111yw4pk6w4gj42fafpcwb0d8bl7a-nvidia-x11-550.76-6.9.0-rc5-bin/bin/nvidia-powerd[3791]: Failed to detach GPU id 256
Apr 23 10:20:19 Zephyrus systemd[1]: nvidia-powerd.service: Main process exited, code=exited, status=1/FAILURE
Apr 23 10:20:19 Zephyrus systemd[1]: nvidia-powerd.service: Failed with result 'exit-code'.
Apr 23 10:20:19 Zephyrus systemd[1]: Failed to start nvidia-powerd service.

That looks even worse: the profile selection doesn’t work either. With the performance profile, the power limit should have increased to 65 W. Are there any ACPI errors in dmesg while switching profiles?
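To compare the reported limit against the expected one without reading the report by eye, something like the following Python sketch could work. It parses the text of `nvidia-smi -q -d POWER` as pasted in this thread; the function names and the regular expression are illustrative assumptions, not part of any NVIDIA tool, and the 65 W threshold is simply the value expected above for the performance profile.

```python
import re

# Matches lines such as "        Current Power Limit   : 55.00 W".
# Lines whose value is "N/A" or "Not Found" do not match.
WATT_LINE = re.compile(
    r"^\s*(?P<key>[A-Za-z ]+?)\s*:\s*(?P<watts>\d+(?:\.\d+)?) W\s*$"
)

# Excerpt of the report posted in this thread.
SAMPLE_REPORT = """\
GPU 00000000:01:00.0
    GPU Power Readings
        Power Draw                        : 10.48 W
        Current Power Limit               : 55.00 W
        Default Power Limit               : 55.00 W
        Min Power Limit                   : 5.00 W
        Max Power Limit                   : 90.00 W
    Module Power Readings
        Power Draw                        : N/A
        Current Power Limit               : N/A
"""


def parse_power_readings(report: str) -> dict[str, float]:
    """Collect every field reported in watts; the first occurrence of a
    field name wins, which is the "GPU Power Readings" section here."""
    readings: dict[str, float] = {}
    for line in report.splitlines():
        match = WATT_LINE.match(line)
        if match:
            readings.setdefault(match["key"], float(match["watts"]))
    return readings


def profile_took_effect(report: str, expected_watts: float = 65.0) -> bool:
    """True if the current limit reached the expected base limit."""
    return parse_power_readings(report)["Current Power Limit"] >= expected_watts


if __name__ == "__main__":
    print(parse_power_readings(SAMPLE_REPORT))
    print("profile took effect:", profile_took_effect(SAMPLE_REPORT))
```

For the report posted above, the current limit stays at 55 W, so the check reports that the performance profile did not take effect.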

No, there are no ACPI errors while switching profiles. I get zero new messages in dmesg when I switch.

dmesg.txt (126.3 KB)

However, there are several ACPI errors during boot and after resuming from sleep, but I don’t know whether they’re related to this issue.

Unlikely.
Please also check with the closed driver.

Nope, the closed driver doesn’t work either.
dmesg.txt (101.6 KB)

> $ cat /sys/firmware/acpi/platform_profile
performance

> $ nvidia-smi -q -d POWER

==============NVSMI LOG==============

Timestamp                                 : Tue Apr 23 16:46:51 2024
Driver Version                            : 550.67
CUDA Version                              : 12.4

Attached GPUs                             : 1
GPU 00000000:01:00.0
    GPU Power Readings
        Power Draw                        : 1.31 W
        Current Power Limit               : 55.00 W
        Requested Power Limit             : 55.00 W
        Default Power Limit               : 55.00 W
        Min Power Limit                   : 5.00 W
        Max Power Limit                   : 90.00 W
    Power Samples
        Duration                          : Not Found
        Number of Samples                 : Not Found
        Max                               : Not Found
        Min                               : Not Found
        Avg                               : Not Found
    GPU Memory Power Readings 
        Power Draw                        : N/A
    Module Power Readings
        Power Draw                        : N/A
        Current Power Limit               : N/A
        Requested Power Limit             : N/A
        Default Power Limit               : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A


Since even the profile selection doesn’t seem to work, you might want to check with the asus-wmi dev:
https://gitlab.com/groups/asus-linux/-/issues

Thank you for your advice, I’ll take a look at it. But how is power profile selection related to the GPU’s power cap? Isn’t that managed exclusively by nvidia-powerd?

No. It’s a combination of both. First, there’s the base power limit. This can usually be modified by the platform profile; in your case it “should” step through 45, 55, and 65 W on quiet, balanced, and performance. Additionally, there’s the “dynamic boost”, controlled by nvidia-powerd. On your system, it has a budget of 25 W. If the CPU doesn’t use its full power budget, nvidia-powerd can shift the unused part over to the GPU, raising the power limit above the base limit. In your case, nothing works at all.
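The arithmetic behind this can be written out as a short Python sketch. The per-profile base limits and the 25 W boost budget are the values quoted in the reply above (what this machine “should” do, not measured values), and `max_gpu_limit` is a hypothetical helper name.

```python
# Base GPU power limit per platform profile, in watts, as quoted above.
BASE_LIMIT_W = {"quiet": 45, "balanced": 55, "performance": 65}

# Extra budget that nvidia-powerd can shift from the CPU to the GPU.
DYNAMIC_BOOST_BUDGET_W = 25


def max_gpu_limit(profile: str, dynamic_boost: bool) -> int:
    """Highest GPU power limit reachable for a profile, with or without
    Dynamic Boost shifting its full budget to the GPU."""
    base = BASE_LIMIT_W[profile]
    return base + (DYNAMIC_BOOST_BUDGET_W if dynamic_boost else 0)


if __name__ == "__main__":
    for profile in BASE_LIMIT_W:
        print(profile, max_gpu_limit(profile, False), max_gpu_limit(profile, True))
```

With both mechanisms working, performance plus the full boost gives 65 + 25 = 90 W, which matches the Max Power Limit in the nvidia-smi output and the 90 W seen on Windows. With neither working, the GPU stays at the 55 W default.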

Thank you for your detailed explanation! So I guess I can only wait for a BIOS fix from ASUS?

That’ll be a long, possibly endless wait, I guess. Of course, once ASUS releases an update, you should test it. In the meantime, contact the asus-wmi dev instead.

OK, I’ll contact them. Thank you again for your patient replies.