Ubuntu 20.04 - NVIDIA GPU consuming power even when using only integrated graphics card (Intel iGPU)

My guess is that runtime suspend works with the nouveau driver but, for some unknown reason, doesn’t work with the nvidia driver (it always shows ‘active’). AFAIK, Ubuntu only installs the file that enables runtime pm together with the nvidia driver, while Pop!_OS additionally uses system76-powerd.
Doesn’t explain why this fails with the nvidia driver, though.

So I tried reinstalling Pop!_OS without the Nvidia driver, and everything worked normally with low power consumption at idle. As soon as I installed the Nvidia driver OR updated the OS, the problem came back. My guess is that the OS update changes some nvidia-related defaults even if the driver is not installed.

Do you think I can somehow report this bug to nvidia? I am pretty sure it is not an issue on my side at this point.

For now I have resorted to using the dGPU just on Windows, but it’s not really ideal.

This rather sounds like something broke in the kernel which is responsible for runtime pm in the first place. Which version works for you?

I have the same/similar problem.

After switching to Intel, I have higher power usage than before (about +12W).

Looks like the nvidia chip is powered on fully while on AC, for no reason.

Intel iGPU:

49W* / GPU temp 62 C

Nvidia dGPU, performance mode:

55W* / 62 C

Nvidia, on demand

38W* / GPU temp 32

* power draw from socket

It only happens while connected to AC !!!

on battery, it works as expected:

iGpu:

The battery reports a discharge rate of 7.7 W - powertop

Looks like dGPU is deactivated

dGPU mode, power usage is expected to be higher:

The battery reports a discharge rate of 20.2 W - powertop

Laptop is a TUXEDO InfinityBook Pro 16 - Gen7 - Max Performance Edition with an RTX 3060.

Is there a way to switch off the dGPU while on AC? To save the planet :)

The “intel” mode shouldn’t be used on modern laptops supporting runtime pm. It doesn’t really do what it used to do. Always use “on-demand” mode.
In that mode, there shouldn’t be a difference between AC/battery, the nvidia gpu will be powered down when unused.
Since you’re measuring on the socket, please make sure the battery is fully charged so it doesn’t draw additional power.
Use

cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status

to check the nvidia gpu is really powered down.
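If the GPU does not sit at 0000:01:00.0 on your machine, the same check can be done for every NVIDIA PCI function. This is only a minimal sketch: the function name and the optional sysfs root argument are made up for illustration, and it assumes the usual sysfs layout plus NVIDIA's PCI vendor ID 0x10de.

```shell
# Print the runtime PM state of every NVIDIA PCI function.
# The optional first argument (hypothetical, for illustration) lets the
# function be pointed at a different sysfs tree; it defaults to /sys.
gpu_runtime_status() {
    sysfs_root="${1:-/sys}"
    for dev in "$sysfs_root"/bus/pci/devices/*; do
        [ -r "$dev/vendor" ] || continue
        # 0x10de is NVIDIA's PCI vendor ID
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/power/runtime_status")"
    done
}
```

Calling gpu_runtime_status without arguments prints one line per function (GPU, and audio if present), each with its address and either "active" or "suspended".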

I made sure battery was fully charged and all apps closed, same brightness…

“The “intel” mode shouldn’t be used”
Well, it gives 2x the battery life.

In Intel mode on Battery:
$ cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status
suspended
GPU temp 32

In Intel mode on AC:
$ cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status
active
GPU temp jumps to 60 C

So when connecting AC, the dGPU is powered on for no reason.

Cheers

Please run nvidia-bug-report.sh as root in on-demand mode and attach the resulting nvidia-bug-report.log.gz file to your post.

nvidia-bug-report.log (1.2 MB)

attached.

there are other issues as well.

  1. Sometimes the external screen complains about an incorrect mode.
  2. Sometimes the laptop hangs for 10 seconds when disconnecting the USB-C hub with the monitor attached.

All that happened just before I ran the bug report.
It hasn’t happened yet in Intel-only mode.

cheers

ps: edited, reached my topic limit of 3 :(

ps2: 3d seems fine, used Unigine_Superposition-1.1.run and got 9985 score, about 75-90FPS.
will try latest kernel later.

ps3: same thing happens with kernel 5.15

The log upload is stuck. Please try unzipping it.


There’s simply something broken, the nvidia driver starts up fine but after 3 minutes, the gpu crashes (Xid 79) and is inaccessible so can’t be turned off and draws power. Didn’t you notice you couldn’t use the nvidia gpu at all?
Please downgrade to driver 520 and check if that resolves it, if not please check for a bios update or contact Tuxedo.

Furthermore, since this is a brand new 12th gen cpu, please use the liquorix ppa to upgrade to kernel 6.0 first.


upgraded from ubuntu 22.04 to 22.10 with kernel 5.19… (wasn’t able to get nvidia working with liquorix kernel)

everything seems to be working.
On demand getting about 7W on battery
Suspend to RAM works.

Also, on AC the CPU/GPU is much cooler, about 20 C cooler: it dropped from 60 to 40.

After a while PM broke again…

Turns out that the GNOME extension I used to read the GPU temp was waking up the GPU all the time. As soon as I removed the GPU temp readout, power usage dropped from 20W to 7-8W.

Is this expected behavior, or a bug somewhere?

This is expected. Those monitoring apps/plugins are all made for desktops where the nvidia gpu is primary and always on. Don’t use it on a hybrid graphics notebook. You can’t ask the gpu something without waking it up.
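If you still want an occasional temperature reading, one possible workaround is to look at runtime_status first and only query the driver when the GPU is already awake. This is a rough sketch, not something any existing extension does as far as I know: the function name is invented, the device path is the one used earlier in this thread, and it assumes that reading runtime_status does not itself resume the device.

```shell
# Query the GPU temperature only if the GPU is already awake.
# The device directory defaults to the address used in this thread;
# pass another one as the first argument if yours differs.
gpu_temp_if_awake() {
    dev_dir="${1:-/sys/bus/pci/devices/0000:01:00.0}"
    status="$(cat "$dev_dir/power/runtime_status")"
    if [ "$status" = "active" ]; then
        # This call talks to the GPU, so it is only made when it is on anyway.
        nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader
    else
        echo "GPU is $status, not querying"
    fi
}
```

Note that polling like this while the GPU is active may still delay it from going back to sleep, so a long polling interval is probably wise.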


nvidia-bug-report.log (1.2 MB)

sorry me again…
Looks like using on-demand mode while on AC turns on the GPU for no reason:

On AC:
cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status
active

Unplug cable, same apps running.
cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status
suspended

plug cable back:
cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status
active

This just produces a lot of unnecessary heat. The GPU runs about 20 C hotter, and the CPU is hotter too because of the shared heat sink.

Is there any way to fix this? Is this a BIOS bug, an nvidia bug, or an Ubuntu 22.10 bug?
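To capture the plug/unplug behaviour in one place, the AC state and the GPU state can be printed side by side. A minimal sketch, with some assumptions: the function name and its arguments are invented for illustration, the adapter is assumed to show up under /sys/class/power_supply with type "Mains" (a USB-C charger may report a different type), and the GPU address is the one from this thread.

```shell
# Print the AC adapter state next to the GPU runtime PM state.
# $1: sysfs root (defaults to /sys), $2: GPU PCI address.
ac_and_gpu_state() {
    sysfs_root="${1:-/sys}"
    gpu="${2:-0000:01:00.0}"
    ac="unknown"
    for ps in "$sysfs_root"/class/power_supply/*; do
        [ -r "$ps/type" ] || continue
        # "Mains" is the type reported by a classic AC adapter
        if [ "$(cat "$ps/type")" = "Mains" ]; then
            ac="$(cat "$ps/online")"
        fi
    done
    status="$(cat "$sysfs_root/bus/pci/devices/$gpu/power/runtime_status")"
    echo "ac_online=$ac gpu=$status"
}
```

Running something like "while sleep 2; do ac_and_gpu_state; done" while plugging and unplugging the cable should show whether the GPU state really follows the adapter.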

Only thing I noticed in the logs is that runtime pm isn’t enabled for the audio device, so this might keep the gpu awake.
Please set it to auto
sudo sh -c "echo auto > /sys/bus/pci/devices/0000:01:00.1/power/control"
and make sure it sticks
cat /sys/bus/pci/devices/0000:01:00.1/power/control
Then check the behaviour on AC again.
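The set-and-verify step above can also be wrapped into one small helper. This is just a sketch: the function name is made up, and it has to run from a root shell because the control file is only writable by root.

```shell
# Set runtime PM to "auto" for one PCI function and verify that it stuck.
# $1: device directory, e.g. /sys/bus/pci/devices/0000:01:00.1
# Returns 0 only if the control file reads back "auto".
enable_runtime_pm() {
    dev_dir="$1"
    echo auto > "$dev_dir/power/control" || return 1
    [ "$(cat "$dev_dir/power/control")" = "auto" ]
}
```

For example, "enable_runtime_pm /sys/bus/pci/devices/0000:01:00.1 && echo ok" prints "ok" only when the setting was accepted.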

i tried

sudo sh -c "echo auto > /sys/bus/pci/devices/0000:01:00.0/power/control"
sudo sh -c "echo auto > /sys/bus/pci/devices/0000:01:00.1/power/control"
and got:

silles@tuxedo:~$ cat /sys/bus/pci/devices/0000:01:00.?/power/runtime_status
suspended
suspended

But it makes no difference to the power draw, and the temp doesn’t drop as it does when unplugged.

This suggests to me that something is still on…

Please check if this works:
https://forums.developer.nvidia.com/t/incorrect-power-management-with-prime-configuration/157927/25?u=generix

i tried
DynamicPowerManagement: 1
DynamicPowerManagementVideoMemoryThreshold: 200

this was the default
DynamicPowerManagement: 2
DynamicPowerManagementVideoMemoryThreshold: 200

no luck
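For anyone repeating this test: the two values quoted above look like lines from /proc/driver/nvidia/params, which is where the loaded driver reports the settings it is actually using. Assuming that is where they came from, a small sketch to print just those lines (the function name and the optional file argument are invented for illustration):

```shell
# Show the dynamic power management settings the loaded driver is using.
# The params file defaults to the one exposed by the NVIDIA kernel module.
show_dpm_params() {
    params_file="${1:-/proc/driver/nvidia/params}"
    grep '^DynamicPowerManagement' "$params_file"
}
```

As far as I know these correspond to the module options NVreg_DynamicPowerManagement and NVreg_DynamicPowerManagementVideoMemoryThreshold, and a changed option only shows up here after the nvidia module has been reloaded or the machine rebooted.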

The linked post has other values.