I already have 4 Teslas, packaged, in the box. Need a motherboard, preferably dual CPU.
Dual CPU AM3 motherboard with full PCIe 2.0 x16 support for 4 Tesla C1060s. Does any such motherboard exist? Essentially similar to the Foxconn Destroyer, but with dual CPU support and PCIe 2.0 x16 support for all Teslas.
Couldn’t find it in the supported hardware list.
If not, my second preference is a Socket 1366 dual-CPU board with full PCIe 2.0 x16 support for 4 Tesla C1060s. Anyone know of one? Something similar to the Tyan S7025, but not that board itself; the jury is still out on whether it really supports 4 Teslas out of the box (for space reasons).
Or if anyone knows of a website where high-end motherboards can be searched by specification (like the CNET one, http://reviews.cnet.com/motherboards/), that would be great too.
Socket AM3 doesn’t support multiple CPUs - neither the socket nor the CPUs designed for it have the requisite extra HT link to permit it. About the best you will be able to do is a 6-core Thuban Phenom II on a 790FX or 890FX motherboard, which will get you 6 CPU cores with 2-channel DDR3 memory and (theoretically at least) 4 x8 PCIe slots hanging off an 8 Gb/s HT 3.0 link.
If you want something like the Tyan S7025, then a Socket C32 or G34 board with 4000 or 6000 series Opterons is probably about the best you can do.
Yes, a P6T7 Supercomputer here too for my 4x setup. It Just Works. Never any motherboard issues, though lots of heat issues… case choice is critical.
I’ve also had good luck with the $120 cheaper P6T 3-way Revolution board… 3 GPUs is just easier in terms of case, PSU, and heat.
I even built one box with the cheapest $185 P6TSE board with 3 GTX295s (x16 x16 x4, but my app wasn’t PCIE limited)… works great.
My preferred “Just Works” combo right now is the P6T Revolution, the Silverstone FT02 case, an i7 930 (or 980X if CPU power is needed), 12GB of RAM, BFG 1200 Watt PSU. GPU mix could be 3x GTX295 or sometimes 2xGTX295 and 1 GT240. I’ll likely switch it to 2xGTX480 1x GTX295 this month.
If I were building a new 4x system I’d probably try the Silverstone Raven 2 case, the “gamer” version of the FT02. It’s gaudy, but it has the same rotated motherboard and giant 180 mm fans for GPU cooling as the FT02, and it has a full 8 slot cutouts (the FT02 disappointingly has only 7).
Now, if you’re really needing dual CPU… you’re on your own. With speedy hexacore chips like the 980x, it’s no longer critical to have dual cpu sockets. (ah, fond 1998 memories of the insane performance dual CPU Celeron 300A hack machines, overclocked to 450 MHz!)
What about the Supermicro motherboard that all the pre-built systems use? That is dual socket and has 5 PCIe slots! (They normally ship with 4 Teslas and a Quadro FX.)
You can see such a pre-built system here… http://3xs.scan.co.uk/ShowSystem.asp?SystemID=1027
But it must be possible to just buy the motherboard on its own, though.
Anyone have any experience with that pre-built SuperMicro system, or just the chassis? How loud is it?! Quite expensive though ($1500 for the barebones system, $800 for the chassis, and hard to find as parts).
I’m glad to hear that the Asus WS series seems to work well. Are there issues with the nForce 200 bridge chips?
I’m looking to build a 4x PCI-E 16x capable computer (with Xeon 5600/3600 etc) that is not incredibly loud – it has to go in my office. Just a Quadro 580 + C2050 for now, but want room to add 2 more C2050 later.
Thanks for the tip on the FT02. Anything else you would consider? I’d much prefer huge low-RPM fans to the small 6x 50 dB fans in the SuperMicro case.
If you are building your own, you’ve got to give serious thought to temperature control. I have a Thermaltake Armor+ case with a 1200 watt power supply, 3 Tesla C1060s plus 1 Quadro FX 3700, dual quad-core 3 GHz CPUs, and 16 GB of RAM. Heat is a major problem. I think I have 12 fans in the system altogether, but I finally solved the temperature control problem by mounting a 120 V fan on the outside of the plastic case side, over the large case fan. I installed an inline speed control, and I wired in an electronic switch (powered from the 12 V inside the case) to turn the fan on when I turn the system on.

Most of the time I keep 2 of the 3 Tesla boards disabled from within the Windows 7/64 device manager and keep the external fan at a relatively slow speed. When I enable all the Teslas, I have to crank up the external fan speed by a lot. This external fan blows directly on the 4 NVIDIA boards, which are separated by only about 1/16-1/8 of an inch, with enough velocity to force the air between the boards. That seemed to be the key to keeping the temperature down. I bought some booster fans to mount on the outside back of the case to help pull air through the Tesla boards, but I didn’t need them once I got the external side fan working.

These configurations just cannot be made quiet. They always operate at a low to loud roar! At least mine does. I’ve looked into active noise suppression like my 4 Cray CX1 deskside supercomputers at work have, but I never found any.
I have an Intel D5400XS motherboard, which takes LGA771 CPUs; I populated it with two Core 2 Extreme quad-core CPUs. Even though this was an NVIDIA-recommended motherboard at one time, I think it was obsoleted by Intel a month after I purchased mine last year. The required CPUs and RAM for this motherboard are too expensive, and you only end up with enough space to fit 3 Teslas plus a Quadro FX.
It’s a blazing personal machine, but it cost me over $10K to build up. (Of course there are a lot of things I didn’t mention above, such as 4 1TB drives and several expensive OS versions, that jacked the price up.)
Yeah, that is the way I have all our rigs set up - blowing cool air with a high-flow fan parallel to the GPUs at the fan end. I have a bunch of pretty tightly packed white box cluster nodes set up like that which run 24/7 flat out, and the GT200s rarely get over 60C. The dual GT200 workstations we have are built into Cooler Master HAF 922 cases, which have a huge front fan that does the same thing, and the GPUs run even cooler than that.
We built ours (Tyan S7025 motherboard) in the Lian-Li case that NVIDIA recommends. It’s a very roomy case, and heat seems not to be a problem: the C1060s run flat out at around 65 degrees, while our overclocked GTX 285s run at about 95 degrees!! There’s lots of airflow from the 5 14 cm fans that come with the case. We’ve got two Xeon 5500s in, and they are by far the loudest component; don’t use the stock Intel CPU coolers, spend some money on a decent, quiet cooler. All in all, our machines are fine to sit on your desk in an office; they are comparable to normal budget Dell desktops in terms of noise. At one point we put all 6 together, and that was both a bit noisy and generated more heat in the office than a small electric radiator!! If you put one under a desk, work out how to route the airflow away from you, otherwise the heat will build up under the desk and make it difficult to work there.
I can only speak from my experience: my Quadro FX would overtemp and cut out when I had the 3 Teslas (2 above the Quadro, and one below) going full speed ahead. It could be that some other things about my system are contributing to the cooling problems. As mentioned, I have 4 1TB drives, and they sit between the front case fan and the NVIDIA boards, so airflow is definitely impeded. There are also a lot of wires in the air path; I could perhaps bend them to allow more airflow from the front fan. I could probably move the drives down to the bottom bays to improve airflow over the boards, but would probably have to add a bottom drive fan. Also, I do have 2 quad-core 3 GHz CPUs, 16 GB of RAM, and a 1200 watt power supply. That’s a lot of heat to get out of the case. It is like having a good-sized electric heater in my computer room when the system is running.
But, in my “case”, replacing the case side fan with the high speed 120V fan solved all of my heat problems.
It’s just something that you have to pay attention to when building up a loaded system.
I wonder if some cases have a greater spacing between PCI slots so that the air flow is better. My 4 NVIDIA boards separations are tiny, as I mentioned.
Gotta say I too am discovering 3-GPU to be much less limiting than 4x in terms of PS, case, motherboard etc. Your “Just Works” combo looks pretty good. That Silverstone FT02 really seems to have “ideal” airflow for a 1-3 gpu setup.
What is the reasoning behind the BFG PSU?
May I ask what your general application is?
Thanks for the notes about the noise levels. The last machine I built was 25 dB, so I’m a bit worried about the C2050…
The BFG has modular cables, so you can keep your case neater when you don’t need to plug in all the extra SATA power cables and so on.
BFG is not the only manufacturer of good PSUs, I just picked that one first, it worked great, and kept it. I’ve used Antec-CP1000 and Corsair-850 PSUs as well… I’d recommend either one for 2-GPU but not 3-GPU boxes.
I think the ToughPower-1500 is also popular.
I run a variety of apps… I’m a developer/researcher so I explore a wide range. Some is for 3D rendering (realtime raytracing, see www.worley.com) but other apps like Monte Carlo integral equations (random walks in complex geometry) and DNA searching. (I told you it was a variety!)
I’ve been considering purchasing a Supermicro 7046GT-TRF and 4x GTX 480s. Does anyone have any experience with this case and motherboard? The PSU is spec’d at 1400 W (when running from a 220 V supply; only 1100 W at 120 V). Would this be sufficient for the GTX 480s? The machine supports 4x C1060s, but those have a lower TDP (190 W instead of 250 W for the GTX 480). The case looks to have good air handling, but it’s hard to tell just from pictures.
Dual CPU machines have significantly more PCIe latency… and that can add to the inherent NUMA RAM latency.
This makes no difference to many (even most) CUDA apps, but is critical for others.
Imagine a CUDA host thread running on CPU0. The GPU lives in a PCIe slot whose lanes are serviced by CPU1. So when the GPU wants to read some system memory, it sends a request over PCIe, which CPU1 forwards to CPU0, where your thread lives. CPU0 now needs to fetch the memory, which may live in NUMA RAM banks serviced by CPU1. So CPU0 requests the memory from CPU1, which fulfills the request and shuttles the data back to CPU0, which then packs it up for a PCIe transfer… by sending it over to CPU1, which controls the PCIe lanes you’re using. (Yes, this is a worst-case example, but it’s also not necessarily so uncommon.)
And yes, you can start playing explicit NUMA games like setting per-thread CPU affinity, pinned memory, etc… but that starts being a lot of accounting.
For GPU performance, it may be better to look for a single CPU where none of this matters since it’s all done locally. When you have a modern i7 hexacore CPU, the benefits of dual CPU aren’t so dramatic and can in fact be negative.
This is an interesting point, but so far the GPU clusters on the TeraGrid have 1 GPU to 1 quad-core CPU. For most applications, I think 3/4 of the cores go idle during GPU runs, but it can be useful to have enough CPU power for setup tasks, etc., that may not have been ported to GPUs. I was purchasing this primarily as a development machine, and having the extra CPU cores around for other tasks (e.g. compiling, data analysis and conversion) would be useful. As an aside, some staff at NCSA have developed a CUDA wrapper library that handles the affinity mapping:
If the affinity were handled properly, wouldn’t you double the effective PCIE bandwidth? Two sets of CPU memory controllers, with each pair of cards on a separate PCI bus…
I’ve been looking at the same system, but have yet to find anyone with one. Perhaps one could talk to a reseller (e.g., SiliconMechanics.com) for a reference. I asked them about support for the C2050, but they declined to answer. To me, it seems unlikely that 4x GTX480 would be a good idea. I wonder about 4x C2050. Perhaps there is a PSU upgrade in the works.
Another issue: would the six 5000 rpm fans plus 4x GTX 480s be bearable in a quiet office?