Reducing power consumption at idle

I have a system with a Q6600, P6N Diamond MB, 4GB RAM, and two 8800 GTXs. Power consumption at idle just after boot is around 230W. Consumption increases (as expected) when running a CUDA program, but does not decrease to idle levels after the program completes; it hovers around 330W until I reboot.

I’ve found this happens with the first execution of any CUDA program after booting. Simply loading the driver module does not increase power consumption.

Our product may spend a significant amount of time powered on and idle, so I would certainly like to reduce power consumption as much as I can. Is there a way I can “clean up” after executing to help?

I noticed something similar to this when I was testing the power consumption of our 3-GPU boxes here, but I neglected to follow up on it so I don’t presently have any data to add to yours. As we are very keen on building GPUs into all of our next-gen compute servers and clusters, I too would love to know how to get the GPUs to throttle down to negligible power consumption when no jobs are running and X11 is shut down.

Cheers,
John Stone

Can you comment on how you’re performing these power measurements?

We do ours using off the shelf “Kill-a-watt” power measurement devices:

http://www.p3international.com/products/sp…0/P4400-CE.html

We use these devices when testing hardware for clusters, to measure real-world power consumption when systems are idle, under load, etc. The device will also show you what power factor the power supply is achieving (a good PSU achieves a value close to 1.0):

http://en.wikipedia.org/wiki/Power_factor
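
For anyone unfamiliar with the term, power factor is the ratio of real power (watts) to apparent power (volt-amperes). A minimal sketch of the arithmetic follows; the function name and the sample readings are made up for illustration and were not measured on any system in this thread.

```python
def power_factor(real_power_w: float, apparent_power_va: float) -> float:
    """Return real power (W) divided by apparent power (VA)."""
    if apparent_power_va <= 0:
        raise ValueError("apparent power must be positive")
    return real_power_w / apparent_power_va


# Hypothetical meter readings, chosen only to illustrate the calculation.
example_pf = power_factor(230.0, 250.0)
print(f"power factor: {example_pf:.2f}")
```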

I assume that this is how others are measuring their power consumption as well.

I’ll try to find time to do an idle power test after a fresh boot and see if I can reproduce the result I’d seen before with the CUDA 1.0 version, as I haven’t done this test for a while, probably since 0.9 or thereabouts.

Cheers,

John

I am measuring inline with the power cord to the system’s power supply, using the same Kill-A-Watt device John mentioned. I am using CUDA 1.0.

That’s interesting. I’m using a remote power switch, which provides assorted electrical readings (including wattage), and I’m not able to replicate this problem. I see 300W used while idle, after booting, yet before starting X. After stopping X, the power consumption drops back down again to 300W. I then ran a few different CUDA apps, saw the power consumption spike up to nearly 400W, and then drop back down to around 330W afterwards (without X running). So, while I am seeing a bit of an increase, it’s not anywhere near the 100W that was originally reported.

I guess there could be other factors here, such as the model/capacity of the power supply, motherboard, etc., which could all be playing a role. My concern, however, is whether we’re all seeing the same issue.

I’d like to see a more detailed breakdown of the power consumption during different phases of testing:

  • POST
  • Boot (before starting X)
  • After starting X (otherwise idle)
  • After stopping X (otherwise idle)
  • Running CUDA app(s) (without X running)
  • Idle (without X running)

It would also be useful to know what kind of system(s) and/or power supplies (and their capacity) are being used.

thanks,
Lonni

I can make those measurements for you today.

188W - Immediately after power on
274W - GPU POST screen displayed
265W - BIOS POST screen displayed
265W - GRUB menu displayed
240W-250W - Kernel and runlevel init
240W - Idle, at login prompt
240W - Idle, logged in
285W peak - While starting X
275W - Idle, X running
322W - Idle, after killing X
355W peak - Running my CUDA application on one card
325W - Idle
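
To make the idle phases easier to compare, here is a small sketch that subtracts the "Idle, logged in" reading from the later ones. The numbers are copied from the list above; treating 240W as the baseline is my own assumption, and the variable names are just for illustration.

```python
# Readings copied from the list above: (phase, watts).
readings_w = [
    ("Idle, logged in", 240),
    ("Idle, X running", 275),
    ("Idle, after killing X", 322),
    ("Running CUDA application on one card (peak)", 355),
    ("Idle", 325),
]

# Assumption: use the first entry (idle before X or CUDA has run) as the baseline.
baseline_w = readings_w[0][1]
deltas_w = {phase: watts - baseline_w for phase, watts in readings_w}

for phase, delta in deltas_w.items():
    print(f"{delta:+4d} W  {phase}")
```

By this arithmetic the final idle reading sits 85W above the baseline, and most of that increase (82W) is already present after killing X, before the CUDA application has run.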

FYI, I don’t run this way. I run “modprobe nvidia” to load the driver then run a simple CUDA program as root to get the /dev/nvidia* nodes created. My original measurements were taken running like this, but there doesn’t appear to be much difference in power consumption.

The power supply is an Ultra X3 1600W. Other system specs are in my original post.

Jim

With 3 GPUs, one of our test systems has the following power consumption profile…

330W - From shortly after power on until the kernel has loaded, daemons etc, prior to starting X…
490W - While X is starting up
417W - Idle, X running
417W - Idle, after killing X (init 3)
700W peak - Running my CUDA application on three GPUs in parallel
500W - Idle

I can run more tests the next time the system is idle. I only had a brief opportunity to run this test as the system had to be rebooted for kernel updates etc anyway…
I can dig up the PSU info for a later note…

Cheers,
John Stone

We have a quad-CPU, 2-GPU system. We’re having a hard time getting repeatable timing, and I noticed a power delta just like you had. I finally noticed that cpuspeed was trying to help us reduce power when we weren’t running anything. I’m guessing that from bootup, cpuspeed senses a small CPU load and backs the CPU clock speed way back. Once a GPU app is run, it sees a slightly higher CPU load and never backs the CPU clocks back again. Try turning off cpuspeed and see if it gets any better.
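
One way to check whether CPU frequency scaling is part of the difference is to record the CPU clocks before and after running a GPU job. The sketch below assumes a Linux kernel that exposes the standard cpufreq sysfs files (`scaling_governor` and `scaling_cur_freq`); the function name `read_cpufreq` is just for illustration, and on systems without cpufreq support it simply returns nothing.

```python
from pathlib import Path


def read_cpufreq(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: (governor, current_khz)} for CPUs that expose cpufreq."""
    result = {}
    for cpufreq_dir in sorted(Path(sysfs_root).glob("cpu[0-9]*/cpufreq")):
        try:
            governor = (cpufreq_dir / "scaling_governor").read_text().strip()
            current_khz = int((cpufreq_dir / "scaling_cur_freq").read_text().strip())
        except (OSError, ValueError):
            # Skip CPUs whose files are missing, unreadable, or malformed.
            continue
        result[cpufreq_dir.parent.name] = (governor, current_khz)
    return result


if __name__ == "__main__":
    for cpu, (governor, khz) in read_cpufreq().items():
        print(f"{cpu}: governor={governor}, current={khz / 1000:.0f} MHz")
```

Run it once at idle after boot and again after a CUDA application has exited; if the reported frequencies are the same both times, CPU scaling is probably not what accounts for the extra power.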

Don

Thanks for the tip. Disabling cpuspeed didn’t help in my case.

Testing with a D870 on the same system shows similar behavior. Here are the power measurements for just the D870:

81W - At idle; driver loaded; before any CUDA applications have been run
300W - While running my CUDA application
214W - After CUDA application exits

Jim,

Have you filed an official bug on this yet?

We just got a D870 that we’ll be testing shortly. If you’ve already filed a bug then I won’t try and reproduce this, but if not, I’ll see if I can cause the same behavior on our system.

Cheers,

John Stone

I have not submitted a bug, but I think I will now that you’ve mentioned it. In any case, would you mind running the test on your D870 and posting your results?

Just an idea, but if you run RivaTuner and look at the overclocking section, you may be able to lower the clock speed on the core, shaders and memory when the card is in 2D mode, so when the application ends the card clocks down… Just a thought. Good luck.

Thanks for the suggestion, but I believe RivaTuner is a Windows-only application.

I’m quite sure there is an nvclock or so app for Linux

Sorry, didn’t see the Linux part (duh), but yeah, I’m sure there will be a Linux variant or some such program allowing you to clock down.

Thanks for the nvclock suggestion. I’ll have a look at it. Maybe it will help until we can get a fix.

I’ve just submitted a bug on this. John, I can include your results if you like; just give me your system specs.

Jim

I am working on an ASUS FX504 with a GTX 1050. After installing CUDA 10.1, I found that my laptop’s battery was draining very fast.
Is this related to the above content?
I also uninstalled CUDA, but the same problem persists. Do you have any solution for this?