Building my own Tesla workstation

Please don’t crucify me if I’m posting in the wrong place. NVIDIA’s “Build your own Tesla workstation” page links to this CUDA forum (see for yourself at the bottom of the page: http://www.nvidia.com/object/tesla_build_your_own.html).

OK, so here are the hardware specs of the workstation I’m intending to purchase, with one (but possibly later two) Tesla C2050s.

Please let me know if I’ve made any obvious mistakes or if I have to post more details! If you think it looks fine, please say so… I want to buy this machine right away!

(everything below conforms to the ATX form factor)

Ultra 4x 1600 Watt Power Supply (This one is “SLI-Ready”; is that important if I want to use two Teslas in one workstation?)

Cooler Master mid-tower chassis (Will a mid-tower be a really tight fit for a 1600 watt power supply, or is it fine because it’s ATX?)

ASUS P6T7 WS Supercomputer Motherboard (Socket LGA1366, 7 PCIe slots, “SLI-Ready”)

Intel Core i7 930, 2.8 GHz, 8 MB L3 cache

6 GB DDR3 1600 MHz RAM (3×2 GB, non-ECC… is 6 GB enough? The motherboard says it only supports 16 GB max)

NVIDIA GeForce 9500 GT video card (for display purposes only, $60 at CompUSA right now… shrugs)

… and hard drives, a 64-bit operating system, etc.

Thanks!

-Chris

This is probably massive overkill for a pair of C2050s and a 9500GT display card. The power draw of a C2050 is only 238W. “SLI-Ready” is semi-useful as an indication that the power supply is designed to run multiple cards, but is not a requirement. You should be fine with a 900W power supply for your application, although it would be helpful to verify that it can deliver enough of that power to the 12V PCI-E rails to run your cards. (PSU makers can be rather sneaky about how they add up their quoted power limits.)
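
As a rough sanity check on that sizing, the arithmetic goes something like this (the Tesla figure is NVIDIA’s published spec; the other wattages are ballpark assumptions of mine, not measured values):

```cpp
// Back-of-the-envelope power budget for the proposed build.
// The Tesla figure is the published spec; everything else is a
// rough TDP guess, not a measured value.
#include <cstdio>

int main()
{
    const double tesla_c2050     = 238.0;  // per card, NVIDIA spec
    const double core_i7_930     = 130.0;  // CPU TDP
    const double geforce_9500gt  =  50.0;  // display card, roughly
    const double board_ram_disks =  75.0;  // mobo, RAM, drives (guess)

    double total = 2.0 * tesla_c2050 + core_i7_930
                 + geforce_9500gt + board_ram_disks;

    // Leave ~30% headroom so the PSU runs in its efficient range.
    printf("Estimated peak draw: %.0f W\n", total);        // ~731 W
    printf("Suggested PSU size:  %.0f W\n", total * 1.3);  // ~950 W
    return 0;
}
```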

This is a very nice motherboard. I am quite happy with it.

The only significant consideration for host memory is how much your application needs. As far as I know, CUDA does not care.

Well, slow RAM might hold your CPU back when moving memory around and might decrease memcpy performance. I read a benchmark article stating that an i7 needs at least 1066 MHz RAM to avoid being memory-limited. But yes, in this case it shouldn’t really be relevant.
For the power supply: make sure it has enough 6/8-pin connectors for both of your cards (I don’t know how many you need; my guess is one 6-pin and one 8-pin per card). I suppose “SLI-Ready” indicates it has enough.
I have a P6T7 too, but when I open NVIDIA X Server Settings on my Ubuntu machine, it shows that the slot my card is plugged into is only PCIe gen 1. Can you confirm this? I used the right slot, I know that much. I also think this shouldn’t matter as long as you don’t want to do concurrent h2d/d2h copies. Am I right on that point? My bandwidth in the bandwidthTest from the SDK looks OK (h2d 5813.7 MB/s, d2h 6101.0 MB/s with pinned memory).
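
(For anyone who wants to reproduce those numbers without the SDK, here is a minimal sketch of what bandwidthTest measures for the pinned-memory host-to-device case; the buffer size and iteration count are arbitrary choices of mine, not the SDK’s actual parameters.)

```cpp
// Minimal pinned-memory host-to-device bandwidth check, a rough sketch
// of what the SDK bandwidthTest measures.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const size_t bytes = 32 << 20;  // 32 MB buffer (arbitrary)
    const int    iters = 100;       // arbitrary repeat count

    float *h_buf = 0, *d_buf = 0;
    cudaHostAlloc((void**)&h_buf, bytes, cudaHostAllocDefault);  // pinned
    cudaMalloc((void**)&d_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    const double mb = (double)bytes * iters / (1024.0 * 1024.0);
    printf("h2d: %.1f MB/s\n", mb / (ms / 1000.0));

    cudaFreeHost(h_buf);
    cudaFree(d_buf);
    return 0;
}
```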

It is true that you see improved host-to-device (and vice versa) bandwidth if you run the Core i7 in the triple-channel configuration (compared to dual-channel). I haven’t seen anyone test different clock rates, though.

My CUDA workstation is not easily accessible at the moment (I’m traveling), so I can’t check the PCI-E issue. I can confirm that my bandwidthTest values looked fine, however.

Hey, thanks for the quick feedback!

OK, yeah, I’ll get a smaller PSU then. I was a little worried about the PCIe slots being Gen1 instead of Gen2 (only because NVIDIA warned against Gen1). Can someone confirm whether the P6T7 motherboard has Gen2 or Gen1 slots? (I’m getting ready to duck in case someone throws a rock with the words “google it, you lazy bum” inscribed on it.)

Others can chime in here, but be very careful and reconsider the Ultra PSU for such an expensive rig. Over the years its reputation in various forums has been terrible, with units often blowing up expensive equipment in the process of dying themselves. I’ve had good luck with multiple Thermaltake and PC Power & Cooling units.

HTH, Vince

Yeah! That warning about Ultra is exactly the reason why I posted here! Thanks!

Yes, I recommend Thermaltake Toughpower PSUs as well. I would not use the consumer Thermaltake ones, though… not because I’ve had bad experiences with one, but because a bad PSU is the most common cause of subtle hardware issues (pretty rare, though), and a blown PSU is by far the most common cause of physical damage to machines (very rare, but devastating, especially if it takes out some Tesla cards).

I looked up the PCIe generation before and did so again just now. The P6T7 is advertised as offering 7 PCIe gen2 slots. Nonetheless, my NVIDIA X Server Settings says my cards are plugged into gen1 slots. I would be happy if anyone could verify that it shows the same for them, and that it’s just a small X Server Settings error.

ONeil, could you have accidentally plugged the Tesla into a gen1 slot? I just finished building my workstation according to the specs at the top, and on this motherboard the PCIe slots alternate gen2, gen1, gen2, gen1, etc. (Well, the manual says they alternate between “x8” and “x16”, but I’m assuming that’s what gen1/gen2 means.)

-Chris

That’s not true, but the effect is the same. PCI Express connections are formed out of one or more serial links, called “lanes”. When describing the capability of a slot, the number of lanes is specified using notation like “x1” or “x16”. Also note that there is a distinction between “physical” and “electrical” capacity: a slot can be physically x16 but only x8 electrical, to save on PCI Express lanes in the chipset. All PCI Express cards are able to query how many lanes the motherboard provides in the slot and work with that.

In addition to this, PCI Express 1.0 and 2.0 define different signaling rates, and 2.0 is twice as fast as 1.0 per lane. So a PCI Express 2.0 x8 link is about the same speed as a PCI Express 1.0 x16 link. When PCI Express 3.0 comes out, we will see another doubling of bandwidth per lane for new devices.
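
To put rough numbers on that, here is the arithmetic, assuming the commonly quoted figures of about 250 MB/s per lane per direction for PCI Express 1.0 and 500 MB/s for 2.0 (after 8b/10b encoding overhead):

```cpp
// Theoretical per-direction PCI Express bandwidth, assuming the commonly
// quoted post-encoding figures: ~250 MB/s per lane for gen 1 and
// ~500 MB/s per lane for gen 2.
#include <cstdio>

int main()
{
    const double per_lane[] = { 250.0, 500.0 };  // MB/s: gen 1, gen 2
    const int    widths[]   = { 8, 16 };         // electrical widths

    for (int g = 0; g < 2; ++g)
        for (int w = 0; w < 2; ++w)
            printf("PCIe %d.0 x%-2d: %5.0f MB/s\n",
                   g + 1, widths[w], per_lane[g] * widths[w]);
    // Note that gen 2 x8 and gen 1 x16 both come out to ~4000 MB/s.
    return 0;
}
```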
