Building your own personal supercomputer: what we did and what problems we had

I’m posting this with the aim of helping others out there who may want to build their own personal supercomputer.
I’ve recently done it, found very little advice on the net, and had a few problems along the way, so I’m putting this here.

My first port of call was NVIDIA’s DIY guide here… [url=“http://www.nvidia.com/docs/IO/63567/web_DIY_PDF.pdf”]http://www.nvidia.com/docs/IO/63567/web_DIY_PDF.pdf[/url] (the link now returns “Page Not Found”).

I wanted 4 Teslas in one machine, so I went for the Tyan S7025 motherboard. Unfortunately this board can’t actually take 4 double-slot cards as advertised, because the ports for the hard drive cables and so on are in the way.

The solution is to get some right angle SATA cables… [url=“http://www.amazon.co.uk/gp/product/B0001Y8UCU/ref=pd_lpo_k2_dp_sr_1?pf_rd_p=103612307&pf_rd_s=lpo-top-stripe&pf_rd_t=201&pf_rd_i=B0001Y8UI4&pf_rd_m=A3P5ROKL5A1OLE&pf_rd_r=0B4MX0FTMKF2EWNGF2GD”]http://www.amazon.co.uk/gp/product/B0001Y8...0FTMKF2EWNGF2GD[/url]

You will need 2 or 3 per machine (more if you have more than one hard drive).

We also tried PCIe extension leads, but they didn’t work at first!
Also go into the PCIe settings in the BIOS, enable power management, and set the PCIe speed to 248. Having done this it works with 4 cards!!!

Fortunately I went for the Lian Li case (which is excellent), as it has PCI slots well below the motherboard, so with the extension cable in place the 4th card can easily be mounted there without any case modification. Other cases might also be roomy enough; I don’t know.

Next problem is the display. The Tyan S7025 motherboard has a built-in ASpeed AST2050 graphics chip, so with 4 Teslas (which have no display output) you can still see what’s going on. However, the onboard graphics are designed for server use, i.e. occasionally plugging in a monitor to set things up, so they are essentially a glorified frame buffer… i.e. expect huge lag. More importantly it doesn’t do OpenGL, which means all the nice pretty CUDA demos won’t work and you may have issues setting up GUIs (though not necessarily). If you want remote access and don’t need fancy graphics output, then it’s probably fine.
(EDIT: the AST2050 does do OpenGL (very slowly), but only if you haven’t installed the NVIDIA driver; once that’s in the system there is no OpenGL.)

Our solution was to buy a GTX 285 (very similar spec to the Tesla C1060, and cheaper too, as other threads here will confirm) and use that as the 4th card rather than a Tesla.

This card works well and is slightly faster than the Tesla (it’s an over-clocked one, so that’s probably why).

We also had serious problems getting the Tesla cards to work, as the onboard graphics uses a different driver. I’ve attached a PDF with a guide to installing them on this setup… You can get the scripts from my colleague Martin, as mentioned in the PDF, if you need them.
UPDATE: you don’t need the scripts if you use the GTX 285 with the Teslas and disable the onboard graphics in the BIOS.
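As a quick sanity check after the BIOS change, you can count the GPUs the PCIe bus sees. This is only a sketch: the here-doc holds illustrative sample lspci lines (the device names are made up for the example); on the real machine, replace it with the live command shown in the comment.

```shell
# Sketch: count the NVIDIA devices visible on the PCIe bus.
# On the real machine, use live output instead of this sample:
#   LSPCI_OUT=$(lspci)
LSPCI_OUT=$(cat <<'EOF'
02:00.0 3D controller: nVidia Corporation Tesla C1060
03:00.0 3D controller: nVidia Corporation Tesla C1060
04:00.0 3D controller: nVidia Corporation Tesla C1060
05:00.0 VGA compatible controller: nVidia Corporation GTX 285
EOF
)
NGPU=$(printf '%s\n' "$LSPCI_OUT" | grep -ci nvidia)
echo "Found $NGPU NVIDIA devices"   # → Found 4 NVIDIA devices
```

If the count comes up short, reseat the card on the extension cable before blaming the driver.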

So we now have a fully working system!!
Tesla_Installation_Guide.pdf (219 KB)

Good info. Thanks for sharing!

What OS are you running?

The install guide looks like a generic guide… It has nothing to do with the supercomputer setup… or am I mistaken?

The install guide shows how to get the NVIDIA driver working alongside a non-NVIDIA display driver, which you must do if you want 4 Tesla cards in this motherboard. Mostly it’s generic to the 190 NVIDIA driver, but setting up the scripts (toward the end) is specific to this supercomputer build, if you use this motherboard (recommended by NVIDIA) and the onboard graphics.

Actually we are also having problems with the PCIe extension… I’ll post more once we have a better solution.

The OS is Ubuntu 9.04.

Seems to be a nice guide…! One small thing…
In point no. 7 (of the installation guide), one need not have root permissions in order to edit one’s own ‘.bashrc’ file. One could simply do a ‘gedit ~/.bashrc’.
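To illustrate the no-root approach: a minimal sketch that appends the usual CUDA paths to your own ~/.bashrc. The /usr/local/cuda prefix is the common default toolkit location, not something stated in the guide, so adjust it if your install differs.

```shell
# Add CUDA to PATH/LD_LIBRARY_PATH in your own ~/.bashrc -- no root needed.
# /usr/local/cuda is the common default install prefix; adjust to taste.
BASHRC="$HOME/.bashrc"
if ! grep -q '/usr/local/cuda/bin' "$BASHRC" 2>/dev/null; then
  cat >> "$BASHRC" <<'EOF'
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
EOF
fi
echo "CUDA paths present in $BASHRC"
```

The grep guard keeps the block from being appended twice if you run it again.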

This is strange. How can NVIDIA certify this MB for 4 Teslas if they don’t fit? Would it be too much to ask you to post some pictures?

I was going to go down the Tyan S7025 path, but this information alone makes the decision very difficult.

Anyone else get 4 Teslas working as certified on this mb ( without using the Riser card)?

Do you mind me asking why?

It does, but the problem I encountered with the flexible risers we purchased was that bandwidth (particularly pinned memory bandwidth) was significantly reduced when the GPU was plugged into the flexible riser compared to when it was either directly inserted into the PCIe 2.0 x16 slot or inserted into a rigid PCIe x16 riser. Because of this, we used GTX 275s for our rebuilt 2U cluster (for which flexible risers were the only option) and installed our six C1060s in ATX cases. Here are the pinned and pageable bandwidthTest numbers for the GTX 275s:

[codebox][root@bdgpu-n09 ~]# /usr/local/cuda_sdk/C/bin/linux/release/bandwidthTest --memory=pinned
Running on...
  device 0: GeForce GTX 275

Quick Mode
Host to Device Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                2714.0

Quick Mode
Device to Host Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                2864.2

Quick Mode
Device to Device Bandwidth
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                107177.7

&&&& Test PASSED
Press ENTER to exit...[/codebox]

[codebox][root@bdgpu-n09 ~]# /usr/local/cuda_sdk/C/bin/linux/release/bandwidthTest
Running on...
  device 0: GeForce GTX 275

Quick Mode
Host to Device Bandwidth for Pageable memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                1923.5

Quick Mode
Device to Host Bandwidth for Pageable memory
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                1765.7

Quick Mode
Device to Device Bandwidth
Transfer Size (Bytes)   Bandwidth(MB/s)
33554432                107217.3

&&&& Test PASSED
Press ENTER to exit...[/codebox]

I can provide bandwidthTest numbers with and without the flexible riser for the Tesla cards, if anyone is interested.

As a side note, I think that calling an IBM PC compatible with 2 (or 3 or 4) graphics cards a “supercomputer” is comparable to calling a basketball player Michael Jordan - basically it’s a joke.

I’ve just updated the original post; we now have all 4 cards working!!
The solution was to use the PCIe extension cable linked in the original post, set PCIe power management to enabled in the BIOS, and set the PCIe speed to 248 in the BIOS.

Don’t know how this affects the bandwidth, but all cards are recognised and working now, so I’m happy.
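One way to see whether the extension cable is costing you bandwidth is to check the negotiated link width per slot: a card behind a marginal cable often trains at a narrower width. This is a sketch over illustrative sample output; on the real machine swap the here-doc for the live command in the comment.

```shell
# Sketch: flag any PCIe link that trained below x16.
# On the real machine, use:  STATUS=$(lspci -vv | grep 'LnkSta:')
STATUS=$(cat <<'EOF'
LnkSta: Speed 5GT/s, Width x16
LnkSta: Speed 5GT/s, Width x16
LnkSta: Speed 5GT/s, Width x16
LnkSta: Speed 2.5GT/s, Width x8
EOF
)
NARROW=$(printf '%s\n' "$STATUS" | grep -cv 'Width x16')
if [ "$NARROW" -gt 0 ]; then
  echo "WARNING: $NARROW link(s) trained below x16"
fi
```

A card that enumerates fine but runs slowly over the extension cable will often show up here as x8 or x4.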

P.S. Put electrical tape on the back of the extension cable so that the pins don’t go through the cable.
Put some washers or a riser in so that the cable isn’t squashed.

Here is a picture with 1 card removed so you can see how it fits…
Photo_on_2010_04_01_at_12.23.jpg

Run

[codebox]bandwidthTest --memory=pinned --device=0

bandwidthTest --memory=pinned --device=1

bandwidthTest --memory=pinned --device=2

bandwidthTest --memory=pinned --device=3[/codebox]

and post the results here, if you don’t mind
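The four runs above can be wrapped in a small loop; the bandwidthTest path assumes the CUDA SDK samples layout used earlier in this thread. As a sketch, the loop below only prints each command rather than executing it (drop the echo to run for real).

```shell
# Run the pinned-memory bandwidth test on each of the four devices in turn.
# Path assumes the CUDA SDK layout quoted earlier in this thread.
BW=/usr/local/cuda_sdk/C/bin/linux/release/bandwidthTest
for dev in 0 1 2 3; do
  CMD="$BW --memory=pinned --device=$dev"
  echo "$CMD"   # replace this echo with: eval "$CMD"  to actually run it
done
```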

UPDATE: Still having problems with the PCIe extension cables. More often than not the computer fails to load Ubuntu; remove the extension cable and all is fine. We’ve tried 2 extension cables, both with the same problem, so we are back to 3 cards per machine rather than 4 :-(

I’m going to try a PCIe riser rather than an extension. I’ll post the results when I get one.

Nice find. Four liquid cooled GTX 480 cards should be possible.

Here’s what my system has:

Qty	Item

1	Tyan S7025 (S7025WAGM2NR)

4	KINGSTON KVR1333D3D4R9SK2/8GI 8GB KIT 2X4GB 1333MHZ DDR3 ECC REG W/PAR CL9 DIMM DR X4 W/TS

2	Intel® Xeon® Processor E5520 (8M Cache, 2.26 GHz, 5.86 GT/s Intel® QPI)

1	LG GH22LS40BLACK LG GH22LS40 SATA 22X/22X *DVD-RW DRIVE, LIGHTSCRIBE

1	THERMALTAKE ARMOR+ VH6000BWS BLACK W 25CM FAN WINOW FULL TOWER 10 PCI SLOT 

2	SEAGATE ST31500341AS Seagate Barracuda 7200.11 - Hard drive - 1.5 TB - internal - 3.5" - SATA-300 - 7200 rpm - buffer: 32 MB

1	COOLMAX 1350W MODULAR PSU W/ 6&8 PIN SLI CERTIFIED CUQ-1350B

1	AMERICAN POWER CONVERSION SUA2200 SMART UPS 2200VA USB AND SERIAL 120V Pout1980W

1	Samsung 24" Widescreen LCD Monitor (2433BW) LS24CMKKFV/ZC 20000:1 1920x1200, RGB & DVI, 5ms

The cables included with the motherboard are not enough; one needs a couple of longer ones with right-angle connectors for the MB (STARTECH SATARA36, 36 in). Also, in order to get the audio and USB headers out, I got some right-angle headers as adapters (SAMTEC P/N SSQ-105-03-G-D-RA). You can get these through www.digikey.com or www.newark.com, for example.

My system is a computational tool, not targeting much graphics. As such, I do not need huge graphical throughput, and the MB’s chipset suffices for now. The OS is 64-bit Linux, which I built/compiled from scratch, straight from the source, every single bit of it. As such, it is minimal in terms of number of packages, but complete and, hopefully, free of distribution bugs. Cross-LFS and BLFS were my guides. It’s up and running very nicely.

Regarding OpenGL, the nVidia installation is pretty poor. Even with no nVidia graphical device to drive, the installation still messes up a bunch of libraries, which is the reason many of the included OpenGL demos do not work. A workaround would be to back those libraries up first, install the nVidia packages, and then restore the saved copies. Of course, some of the demos would need a bit of hacking as well.
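The backup half of that workaround might look something like this sketch. The library paths are illustrative assumptions (check where your distribution actually keeps libGL before relying on them), and the restore step is left as a comment since it must run after the nVidia installer.

```shell
# Sketch: save the distribution's OpenGL libraries before running the
# NVIDIA installer, so they can be restored afterwards. Paths are
# illustrative -- locate libGL on your own system first.
BACKUP="$HOME/gl-backup"
mkdir -p "$BACKUP"
for lib in /usr/lib/libGL.so* /usr/lib/libGLU.so*; do
  [ -e "$lib" ] && cp -a "$lib" "$BACKUP/"
done
echo "Saved $(ls "$BACKUP" | wc -l) libraries to $BACKUP"
# ...run the NVIDIA installer here, then restore, e.g.:
#   cp -a "$BACKUP"/libGL* /usr/lib/
```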

I hope this helps some of you out there.

Happy building,

Tibor

4 Teslas I hope.

Also are you able to share the kernel config file and (sheepishly) perhaps the kernel itself?

Ahh, brilliant, thank you. I’m ordering some now.

Cheers

Yes, of course, 4 Teslas.

I would add one more thing. When you put in the SSQ right-angle headers for the audio and USB, use a bit of electrical tape to cover the leads that remain exposed at the “elbow” of the header. Do not use much; a single thin layer will do. These are very low voltage signals, so you do not have to worry about arcing, but you need to make sure that under no circumstances can the body of the Tesla short any of the pins of the header. This is just a bit of extra precaution, since the Tesla and the header ought not to touch at all in the first place. There is a gap of about 1–1.5 mm remaining after you fully insert the Tesla into its socket.

I attached my kernel configuration file [attachment=16902:Openumer…nfig.tar.gz]. Keep in mind that I am trying to build a minimal kernel/system, so it fits only the machine hardware I described. If you come across an error therein, please let me know. Thanks.

All the best building,
Openumerix_kernel_config.tar.gz (12 KB)

Another approach is to use a case designed to host four double-wide GPU cards. There’s another thread in here somewhere about that, but a quick guide is at [url=“http://www.manifold.net/downloads/Building_an_E_Box.pdf”]http://www.manifold.net/downloads/Building_an_E_Box.pdf[/url]

Tesla C2050 has a DVI connector, so you don’t need an extra video card anymore. Also, four GTX 480s or 470s make a sprightly system as well, albeit at lower performance, with less RAM and no ECC compared to Tesla.