How to control the number of cores which are going to be used

Hi, I have a question regarding the number of cores.

In the Tesla C2070, there are 448 CUDA cores. When I do CUDA coding, can I control how many cores I am going to use?

Because I am thinking about power consumption. Intuitively, the more cores the program uses, the more power the GPU will consume. So I want to see if there is a trade-off to be had by controlling the number of cores being used.


I know they do it automatically with the last few generations of GeForce cards, so I'd assume they would have implemented something so fundamental into the Tesla cards. They don't turn off cores per se, but they do downclock the card severely when no intensive 3D work is running, and scale the clocks with the level of 3D demand. My 465 will downclock itself to 50 MHz when not in 3D use. I'd assume that saves quite a bit of power, especially with a power-hungry Fermi.

As for a direct answer to your question, I have no idea. I've done the opposite with one of my 465s by unlocking it to the 448 cores of the 470 it really was, up from the 352 of a normal 465.

The CUDA context created by your program is always given all of the cores at full clock rate for execution. (Note that if you create multiple CUDA streams in your program, then it is possible for those cores to be divided between different running kernels.) The only way to deliberately lower the power consumption of your device is to underclock it, which can be hard to do these days. (Or to buy a lower power card.)
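To illustrate the stream case, here is a minimal sketch (the kernels `kernelA` and `kernelB` are hypothetical stand-ins for real work) of issuing two kernels on separate streams; with small enough grids they may execute concurrently, each occupying a subset of the SMs, though the partitioning is entirely up to the hardware scheduler:

```cuda
#include <cuda_runtime.h>

// Hypothetical kernels -- stand-ins for real work.
__global__ void kernelA(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= 2.0f;
}
__global__ void kernelB(float *y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] += 1.0f;
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMalloc(&x, n * sizeof(float));
    cudaMalloc(&y, n * sizeof(float));

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Kernels issued on different non-default streams may overlap on
    // devices that support concurrent kernel execution (Fermi and later).
    kernelA<<<8, 256, 0, s1>>>(x, n);
    kernelB<<<8, 256, 0, s2>>>(y, n);

    cudaDeviceSynchronize();
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

Note that even here you are not choosing which cores run which kernel; you are only giving the scheduler the opportunity to run the kernels side by side.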

I suppose you could issue kernels with fewer blocks than multiprocessors, but I doubt you would see much power difference. Current CPUs are able to clock and partially power down unused cores independently, which is where the major power savings come from. There is no indication that NVIDIA GPUs can do the same thing. Not to mention the performance loss caused by running so few blocks could be considerable.


But I still have a similar question:

How to address the CUDA cores?

More specifically, can I somehow control the number of cores that I want to use? Can I assign or choose certain cores to run a certain kernel function?

I want to have this kind of functionality, because it would give us more flexibility in programming.

Thanks for any reply :)

You cannot control the number of cores that a kernel will be scheduled on. The closest you could come to this ability is to launch 1 block per SM.
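A sketch of that "1 block per SM" idea: query the SM count from the device properties and use it as the grid size. The scheduler usually spreads the blocks across SMs, but note that the programming model does not guarantee this placement.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel: each block writes one value.
__global__ void oneBlockPerSM(float *data) {
    if (threadIdx.x == 0)
        data[blockIdx.x] = (float)blockIdx.x;
}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    // A Tesla C2070 reports 14 SMs (14 x 32 = 448 CUDA cores).
    printf("SM count: %d\n", prop.multiProcessorCount);

    float *d;
    cudaMalloc(&d, prop.multiProcessorCount * sizeof(float));

    // Launch exactly one block per multiprocessor.
    oneBlockPerSM<<<prop.multiProcessorCount, 32>>>(d);
    cudaDeviceSynchronize();

    cudaFree(d);
    return 0;
}
```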

While addressing SMs isn’t impossible with CUDA, it most likely isn’t what you want. What are you trying to achieve? If you sketch that, we might be able to suggest a better way of doing it.
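For completeness: a kernel can find out which SM a block landed on by reading the `%smid` special register documented in the PTX ISA. This is a sketch of the hack, not a recommended design; `%smid` is informational only, and the runtime gives you no way to request a particular SM.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void whichSM(unsigned int *smids) {
    unsigned int smid;
    // %smid is a PTX special register holding the multiprocessor ID.
    asm volatile("mov.u32 %0, %%smid;" : "=r"(smid));
    if (threadIdx.x == 0)
        smids[blockIdx.x] = smid;
}

int main() {
    const int blocks = 8;
    unsigned int *d_smids, h_smids[blocks];
    cudaMalloc(&d_smids, blocks * sizeof(unsigned int));

    whichSM<<<blocks, 32>>>(d_smids);
    cudaMemcpy(h_smids, d_smids, sizeof(h_smids), cudaMemcpyDeviceToHost);

    for (int b = 0; b < blocks; ++b)
        printf("block %d ran on SM %u\n", b, h_smids[b]);

    cudaFree(d_smids);
    return 0;
}
```

At best you could use this to make blocks on unwanted SMs exit early, but as noted above, that costs performance without any evidence of saving power.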

You could also group the threads into fewer blocks. Then again, the resulting drop in performance might make overall energy consumption worse, since the kernel would run for longer.