A driver to disable & re-enable CUDA cores… possible? Just an idea that popped up in my brain

Hi all,

Just had a thought come to mind: what if NVIDIA made a driver that allowed you to enable and disable CUDA cores? This could a) save power and b) prevent overheating of the GPU under certain workloads.

Say you are doing some basic 2D work and your GPU has a total of 512 CUDA cores. You could then have a setting between 16 and 512, so for this app you could go for 16 of the 512 CUDA cores and turn all the others off.

If you are doing somewhat heavier work, like a 3D app from a few years back (say, 2005), you could enable 256 of the 512 CUDA cores for the workload, thereby having enough rendering capacity while saving power by disabling the others. And when it’s really needed, you can just enable all 512 and go onwards. Maybe something like this should have a manual mode, an auto mode, and an off mode that keeps the full 512 CUDA cores in use.
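
To make the idea concrete, here’s a rough sketch of what such a driver setting could look like; every name in it is hypothetical, since no such interface exists in any real NVIDIA driver:

```python
# Purely hypothetical sketch of the proposed per-app core control.
# None of these names exist in any real NVIDIA driver or tool.

TOTAL_CORES = 512

def set_active_cuda_cores(app, cores, mode="manual"):
    """Pretend driver call: keep `cores` CUDA cores powered for `app`.

    mode: "manual" = fixed core count chosen by the user,
          "auto"   = driver scales the count with the app's load,
          "off"    = feature disabled, all 512 cores stay on.
    """
    if mode == "off":
        cores = TOTAL_CORES
    assert 16 <= cores <= TOTAL_CORES
    print(f"[{app}] {cores}/{TOTAL_CORES} CUDA cores active (mode={mode})")

set_active_cuda_cores("image_viewer.exe", 16)         # basic 2D work
set_active_cuda_cores("older_3d_app.exe", 256)        # mid-2000s 3D app
set_active_cuda_cores("heavy_render.exe", 512)        # full workload
set_active_cuda_cores("anything.exe", 0, mode="off")  # feature off
```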

Let me know of your thoughts.

Peace.

An interesting idea, though I don’t know what changes to the hardware would be required.

Current chips can change their clock rates, which can be just as effective a power-saving mechanism. At the moment, it seems to be used mostly to reduce idle power. I am not sure if any of the drivers give you the ability to force the card to stay at a low clock rate, even while in use. (Except for the occasional driver bug that does this! :) )

Hmm, well, I thought it could be a software issue, since we use drivers to enable and disable hardware-related things like SLI or FSAA, and as we all know, GeForce GPUs since the GeForce 256 (aka NV10) are programmable. So I thought maybe the driver could be programmed to turn CUDA cores on and off, manually and via some auto setting; with that setting disabled, the card would keep using all its CUDA cores. I guess it’s rather complicated, but AFAIK a GTX 465 uses less power than a GTX 480 because it has far fewer CUDA cores, so I thought, okay, that makes sense: fewer cores that need power to work.

So if we had a driver that let us disable and enable CUDA cores for certain apps, we could save energy this way as well, hehe. I suppose some things are very hard to implement; I never knew that these CUDA cores don’t depend on software the way SLI and other features like FSAA and AF do.

But yeah, that might mean the entire design has to be modified to make this possible, hmm… it would be something worth adding, I think, and yes, I’d be willing to test this.

It’s a good idea that would work, but the interesting question is whether it’d be worth the complexity.

It’d require hardware-level changes… if different SMs ran at different clock rates (idle versus full power), their signal lines would need to be separated into new chip-level clock domains. Merely disabling the cores sounds good too, but that means you also need independent POWER domains on the chip… and the ability for signal lanes to deal with neighbors which may or may not be active. It gets complex.

However, it’s not impossible; the latest i7 chips from Intel do many of these tricks with special per-core power gating.

GPUs will likely do more clock/power gating in the future too. Likely the priority will stay chip-wide for a long while (not per-SM), since it’s much more important to control idle power than it is to fine-grain optimize partial shutdowns. And in fact the parallelism of GPU tasks means that it’s almost always easier to use the whole chip for a shorter time than it is to dynamically shut down variable numbers of SMs. On the CPU, threads can’t be broken up and shared among cores… GPU tasks are fundamentally designed to parallelize massively.

We can see that GPUs already do shut down parts of themselves when running CUDA apps… likely the rasterizers and other graphics-specific hardware are quiescent even while the SMs and memory are cranking at full clock rate running CUDA programs. This hypothesis is strongly supported by measuring GPU wattage during heavy graphics tasks (for example, worst-case FurMark on a GTX 480, using 300 watts) versus heavy full-power CUDA programs on the same GTX 480 using only 200 watts.

You can also see optimizations like Optimus for laptops, which dynamically powers down the GPU completely (0 watts!). That’s extremely elegant: a great power saving for the common idle case, achieved by letting a separate low-capability but low-wattage chip handle the 2D display.

Hmm, I think someday this might be handy. If we can reduce power by lowering the GPU clock in some cases, and then improve on this by disabling some CUDA cores manually or automatically as well, more power can be saved. But yeah, it might be too complicated to add, though it could come in handy for future products, especially power-saving products, to make them more power efficient. B)

Every new design comes with a certain level of complexity, but every new idea is worth adding or mixing with another :) So yeah, feel free to use this idea, NVIDIA B)

Reducing frequency is a more effective way to save power than switching off processing elements.

By switching off processing elements, you get power consumption that scales linearly with processing throughput. By reducing frequency instead, you also get a linear relation, but on top of that you can save power by reducing voltage. Power is approximately proportional to the voltage squared, so voltage reduction is an effective way of saving power.
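
A back-of-the-envelope sketch of that argument, assuming dynamic power follows P = C·V²·f and that voltage can be lowered roughly in proportion to frequency (illustrative numbers, not measurements):

```python
# Toy dynamic-power model: P = C * V^2 * f (arbitrary units).

def dynamic_power(c, v, f):
    return c * v * v * f

full = dynamic_power(1.0, 1.0, 1.0)

# Option A: gate off half the cores. Half the capacitance switches,
# V and f unchanged -> throughput and power both drop linearly to 50%.
half_cores = dynamic_power(0.5, 1.0, 1.0)

# Option B: halve the frequency and scale voltage down with it.
# Same 50% throughput, but power falls with f * V^2, roughly cubically.
half_freq = dynamic_power(1.0, 0.5, 0.5)

print(full)        # 1.0
print(half_cores)  # 0.5
print(half_freq)   # 0.125
```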

I wish NVIDIA would release a GPU for no cost at all… I could pay per kernel usage… probably a GPU with a credit-card reader! ;-) Compute on Demand… (Well, not my words… I think IBM introduced it…)

You just ignored the leakage currents that also affect idle logic gates. Your statement would only be valid in an ideal world.

Switch off the power for some circuits and you also eliminate leakage.

True, and as transistors get smaller, this static leakage current becomes a larger component of the total power draw. Not sure what the ratio is now between the dynamic component (scaling with freq and V^2) and the static component, though.
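
To put rough numbers on the leakage point, here’s the same toy model as above with a static term added; the 70/30 dynamic/static split is made up purely for illustration:

```python
# Toy model with leakage: P = C * V^2 * f + V * I_leak (arbitrary units).
# Assume 70% of full-chip power is dynamic and 30% is leakage (made up).

def total_power(c, v, f, i_leak):
    return c * v * v * f + v * i_leak

full = total_power(0.7, 1.0, 1.0, 0.3)     # 1.0

# DVFS to half frequency/voltage: the dynamic part shrinks cubically, but
# the leakage term only shrinks with V -- the idle gates still leak.
dvfs = total_power(0.7, 0.5, 0.5, 0.3)     # 0.0875 + 0.15 = 0.2375

# Power-gate half the chip: the gated half's dynamic AND leakage power
# both go to zero, since those circuits are cut off from the supply.
gated = total_power(0.35, 1.0, 1.0, 0.15)  # 0.35 + 0.15 = 0.5

# And the two techniques combine: gate half the chip, DVFS the rest.
both = total_power(0.35, 0.5, 0.5, 0.15)   # 0.04375 + 0.075 = 0.11875
```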

Well, the way I got to this idea was by first looking at how SMP works with CPUs. The two-way Opteron DP 2380 C2 system I am using at the moment has 8 cores in total, and via the Task Manager you can set each process’s affinity to specific cores. This way you could use one core for MP3s and let the others do nothing, going into idle mode and thereby saving power! Or use them for some other application, game, or whatever command serves your mind.
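
The same thing can be done programmatically; a minimal sketch using Python’s Linux-only scheduling API (Task Manager’s “Set affinity” dialog is the Windows equivalent):

```python
# Pin the current process to one CPU core, leaving the others free to idle.
import os

pid = os.getpid()
print(os.sched_getaffinity(pid))  # e.g. {0, 1, 2, 3, 4, 5, 6, 7} on 8 cores

os.sched_setaffinity(pid, {0})    # restrict this process to core 0 only
print(os.sched_getaffinity(pid))  # {0}
```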

Here’s what my system has, CPU-core and GPU-core wise (the CUDA cores aren’t shown, though):
[screenshot of CPU and GPU cores]

Then I thought, well, if we can set affinities with our multi-core CPUs, why not do this with multi-core GPUs as well? You’d select the right number of CUDA cores for a certain app and disable the ones you don’t need; combined with reducing the clock speed, this could save much more power than lowering clock speeds alone. It could also be a way to divide CUDA core load over several applications, like what we can do with multi-core CPUs. Just a thought.

Disabling and re-enabling cores does already exist: a good friend of mine who studies computer engineering tells me this idea of mine is called “clock gating”, which basically does take place.

When I ran a CPU-intensive program like FSX on all 8 cores compared to just 1, the power usage was a lot higher when all 8 cores were used. So this got me to this idea for GPUs; and if a driver can’t let us disable cores outright, maybe setting them to a certain affinity might be more doable?
