I am new to GPU programming. Is there a way to tune the performance of an NVIDIA GPU (e.g. a Quadro FX 5800) for a specific application so that it consumes the least power? My application will not always need all the CUDA cores. How can I use only a specified number of CUDA cores, or other features, to minimize power consumption? Any help is highly appreciated.
Thread scheduling is controlled at the hardware level and is not configurable.
SMs that have no work still burn a significant fraction of the power of a busy SM. The best strategy is to use every SM and finish your computation sooner, so that the device can drop the clock rates of all the cores once the kernel has completed. Changing the clock rate is the primary way of saving power. The dynamic clock rate is controlled by the driver (the feature is called “PowerMizer”).
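To make the "finish fast, then clock down" argument concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function name `window_energy` and every number in it (SM count, per-SM power, idle fraction, low-clock power) are made-up illustrative values, not measurements of any real card, and it assumes the work scales perfectly across SMs.

```python
def window_energy(active_sms, total_sms=30, work=30.0, window=10.0,
                  p_busy=5.0, idle_fraction=0.6, p_low=50.0):
    """Energy (joules) used over a fixed time window, under a toy model.

    All parameters are hypothetical illustration values:
      work          - total work in SM-seconds (assumes perfect scaling)
      window        - length of the observation window in seconds
      p_busy        - power of one busy SM in watts
      idle_fraction - power of an SM with no work, as a fraction of p_busy,
                      while the clocks are still high
      p_low         - whole-device power once the driver has dropped the clocks
    """
    if not 1 <= active_sms <= total_sms:
        raise ValueError("active_sms must be between 1 and total_sms")
    runtime = work / active_sms
    if runtime > window:
        raise ValueError("work does not fit in the window")
    # While the kernel runs, SMs without work still draw part of p_busy.
    idle_sms = total_sms - active_sms
    p_running = p_busy * (active_sms + idle_fraction * idle_sms)
    # After the kernel finishes, the whole device sits at the low-clock power.
    return p_running * runtime + p_low * (window - runtime)


if __name__ == "__main__":
    for k in (3, 10, 15, 30):
        print(f"{k:2d} SMs busy: {window_energy(k):7.1f} J")
```

In this model the energy works out to a constant plus `(work / active_sms) * (idle_fraction * p_busy * total_sms - p_low)`, so using more SMs wins whenever the idle SMs together draw more than the clocked-down device. If idle SMs really cost nothing (`idle_fraction=0`), the conclusion flips, which is why the first sentence of the post above matters.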
Is it possible to get the clock rate to drop between kernel calls? I’ve noticed that the power usage only drops once the CUDA context is destroyed at the end of my program. I have a daemon that performs a GPU calculation for network clients, and it would be nice if I could sit at idle power until a request came in.
Not at the moment. I’ll consider an API for this.
(there are reasons why you don’t want it to happen automatically)
I can imagine some latency if this happened automatically, though an API call that I could trigger after 20 seconds of idle time would be perfect for my application.
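Until such an API exists, one possible workaround for a daemon is to build on the observation above that power drops once the context is destroyed: tear the context down after an idle period and recreate it on the next request. Below is a minimal Python sketch of that bookkeeping. The class name `IdleGpuContext` and the `create` / `destroy` callbacks are hypothetical. In a real daemon `destroy` might wrap something like the CUDA runtime's `cudaDeviceReset()`, but whether that actually lowers the clocks on a given driver is an assumption, not something confirmed in this thread.

```python
import time


class IdleGpuContext:
    """Create a GPU context lazily and release it after a period of inactivity.

    `create` and `destroy` are caller-supplied callables (hypothetical here);
    this class only tracks when the context was last used.
    """

    def __init__(self, create, destroy, idle_seconds=20.0, clock=time.monotonic):
        self._create = create
        self._destroy = destroy
        self._idle_seconds = idle_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._active = False
        self._last_used = None

    def run(self, job):
        """Run one request, creating the context first if it was released."""
        if not self._active:
            self._create()           # this is where the wake-up latency is paid
            self._active = True
        try:
            return job()
        finally:
            self._last_used = self._clock()

    def poll(self):
        """Call periodically; returns True if the context was just released."""
        if self._active and self._clock() - self._last_used >= self._idle_seconds:
            self._destroy()
            self._active = False
            return True
        return False
```

The daemon would call `run(...)` for each client request and `poll()` from its main loop or a timer. The trade-off is the one mentioned above: the first request after an idle period pays the cost of recreating the context and re-uploading any device data.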