TeraGrid SU

superGPU · April 28, 2008, 8:36am

Hello Ladies and Gentlemen – I was wondering if anyone familiar with TeraGrid might be able to suggest an SU conversion value for a GPU-hour of work in a cluster environment?

I know there are lots of details that would have to be specified, but I was just wondering if someone with experience might have a ballpark estimate.

Thanks!

MisterAnderson42 · April 28, 2008, 1:56pm

For molecular dynamics ( http://www.ameslab.gov/hoomd ) a single Tesla GPU performs equivalently to ~30 CPU cores in a fast cluster. Most speedups I’ve seen reported are in a similar ballpark. Some algorithms that aren’t particularly well adapted to the data-parallel architecture are slower, while others are faster.

This thread (The Official NVIDIA Forums | NVIDIA) has some more examples, though most compare single- or multi-core CPUs to a single GPU rather than to clusters.
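
To make the ~30-core figure concrete, here is a rough sketch of the arithmetic in Python. It assumes that 1 SU corresponds to one CPU core-hour on the cluster in question; that is an assumption for illustration, not something stated in this thread. The function name and parameters are hypothetical.

```python
def gpu_hours_to_su(gpus=1, hours=1.0, speedup_vs_core=30.0, su_per_core_hour=1.0):
    """Estimate the SU charge for a GPU job from an equivalent-core speedup.

    speedup_vs_core: how many CPU cores one GPU replaces (~30 is the
        molecular dynamics figure quoted above; other codes will differ).
    su_per_core_hour: ASSUMPTION, 1 SU per CPU core-hour.
    """
    equivalent_core_hours = gpus * hours * speedup_vs_core
    return equivalent_core_hours * su_per_core_hour


# One GPU for one hour, using the defaults above.
print(gpu_hours_to_su())
# Four GPUs for two and a half hours.
print(gpu_hours_to_su(gpus=4, hours=2.5))
```

An algorithm that maps poorly onto the GPU would use a smaller `speedup_vs_core`, so the estimate only holds for the workload it was measured on.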

rplzzz · April 29, 2008, 1:59pm

If it were me, I wouldn’t necessarily set the SU cost based on the GPU’s speedup relative to the other cores in your cluster. What you really have is a capacity planning problem; you want your GPUs to be used with roughly the same duty cycle as the rest of your cluster. All other things being equal, that would suggest that a GPU-hour should be valued relative to a node-hour by a factor equal to the speedup, but there are several factors that would change that. For instance, if you have fewer GPUs than nodes, that would make GPU-hours relatively more valuable. Offsetting that, the relative newness of GPU programming will probably make some users reluctant to try it, which would tend to make GPU-hours relatively less valuable. My guess is that for now users’ resistance to change will dominate, meaning GPUs should cost less than the speedup ratio, at least until GPU programming is widely accepted amongst users.

If your users will tolerate tinkering with the SU cost structure, I would recommend starting with a reasonable guess and seeing what the queue for jobs requesting GPU resources looks like. Then adjust the cost over time until the utilization for the GPUs is approximately the same as the rest of the system.
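
The adjust-over-time idea can be sketched as a simple feedback rule in Python. This is only an illustration of the approach described above: the function name, the `gain` and `floor` parameters, and the proportional update are all assumptions, not an established TeraGrid policy.

```python
def adjust_gpu_price(price, gpu_utilization, cluster_utilization, gain=0.5, floor=1.0):
    """Nudge the SU price of a GPU-hour toward utilization parity.

    price: current SUs charged per GPU-hour.
    gpu_utilization, cluster_utilization: fraction of time busy (0 to 1)
        over the last accounting period.
    gain: hypothetical damping factor; smaller values change the price
        more slowly.
    floor: hypothetical minimum price so the charge never reaches zero.
    """
    if cluster_utilization <= 0:
        raise ValueError("cluster_utilization must be positive")
    # GPUs busier than the rest of the cluster -> raise the price;
    # GPUs idler than the rest of the cluster -> lower it.
    ratio = gpu_utilization / cluster_utilization
    new_price = price * (1.0 + gain * (ratio - 1.0))
    return max(floor, new_price)


# Start below the speedup ratio, then re-price once per accounting period.
price = 20.0
price = adjust_gpu_price(price, gpu_utilization=0.4, cluster_utilization=0.8)
print(price)
```

When the two utilizations match, the price is left unchanged, which is the steady state the post is aiming for.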

[Edited to correct a small logic error in the original]

-rpl
