Newbie Cluster Builder

Hello everyone,

I’m currently networking together a small collection of identical computers to work on CFD, chemical reaction, and coupled physics problems, as a hobby. My idea is to use older and/or commonly available cheap hardware to build a decently powerful and easily scalable compute cluster.

I’m not going to lie, the more feeble attempts I make at learning about older InfiniBand hardware, the more I realize that I have no f****** idea what I’m doing.

At the moment (and embarrassingly) I’m running Windows Server 2012 R2: I’m very familiar with it professionally, it comes with a number of built-in tools for managing multiple computers, and, most importantly, the engineering software I’m using at the moment is only licensed for Windows platforms (don’t ask me why). Also, I’m a complete idiot when it comes to anything that isn’t Ubuntu, but I’m willing to learn…eventually.

Currently my network hardware manifest includes a QLogic SilverStorm 24-port DDR switch [9024-FC24-ST1-DDR], four Mellanox InfiniHost III DDR cards [MHGS18-XTC], four Mellanox CN passive copper cables [MC1104130-002], and a 1 GbE switch.

I’ve been able to run a few simulations so far on this setup; however, my speed-up is terrible, and I shouldn’t wonder why. I’m currently using a software subnet manager [OpenSM] along with an IP-over-IB (IPoIB) driver rather than a direct RDMA path (at least I think that’s what’s going on). According to my HPC Cluster Manager diagnostics, my throughput is half what it should be (~1000 MB/s) even though the switch does indicate that DDR is active. I had to use an older Mellanox driver (2.1.2) for the cards because they are not supported by newer driver versions, at least that is my understanding.

My objective at the moment is to get the IB setup I have now configured correctly so that it runs at its peak speed, if that is even possible. At this point I should also note the great deal of difficulty I’m facing getting a copy of software to interface with my switch, such as FabricSuite or the QuickSilver OS management tools. Also, it seems that the switch’s management network connector does not respond to any connections, as indicated by the ‘Mgnt’ LED never lighting up.

Needless to say I have quite a mess on my hands and would greatly appreciate any help to get this sorted out.

Thanks in advance.

https://lh5.googleusercontent.com/-JT4nVU6tEuM/VDp0VJau8dI/AAAAAAAAD8Q/eFf1l_lzUiw/w1260-h709-n-o/20141009_211620.jpg

Hello Ali and welcome to our community!

Although this community deals with really complicated matters, I have to admit I enjoyed reading your post. We will try to get you going.

So far it sounds like you have a good idea of the basics and beyond. You are almost there!

Practically speaking, you now have a DDR network, which in theory is capable of 20 Gb/s speeds per 4X link.

Your servers and HCA cards will probably give you somewhat less because of encoding and bus overhead: 8b/10b encoding turns the 20 Gb/s signal rate into 16 Gb/s (roughly 2 GB/s) of actual data, and the PCI Express bus the HCA sits in adds its own encoding and protocol overhead on top of that.

Pointers for getting the max performance for HPC applications:

  • Make sure your HCAs and switch run the latest FW and SW available for their models (a quick way to check the HCA firmware is sketched after this list).

  • Make sure that your HCAs are all connected to a PCI-E Gen2 or Gen3 bus (and that the Gen mode is enabled in the BIOS!).

  • Make sure that your network is healthy. You could have degraded links or links with errors.
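
Once you have a Linux node on the fabric (suggested below), here is a rough sketch of how you could check the first two points, assuming the Mellanox firmware tool (mstflint) and pciutils are installed; the PCI address 02:00.0 is only a placeholder for wherever your HCA actually shows up:

    # find the HCA's PCI address (the 02:00.0 used below is just an example)
    lspci | grep -i mellanox

    # query the firmware version currently burned on the card
    mstflint -d 02:00.0 query

    # check the negotiated PCIe link speed and width (look at the LnkSta line)
    sudo lspci -vv -s 02:00.0 | grep -i lnksta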

Since I am not sure what diagnostic tools are available with the server driver version you have, I suggest you add a 5th machine to the network, running Linux and a recent MLNX_OFED (MOFED) version. Windows and Linux can share the same fabric, no problem.
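
If it helps, installing it usually comes down to something like this (a minimal sketch; the exact bundle name depends on the MLNX_OFED release you pick, and it has to be one that still supports the InfiniHost III cards):

    # unpack the MLNX_OFED bundle downloaded from the Mellanox site (filename is just an example)
    tar xzf MLNX_OFED_LINUX-*.tgz
    cd MLNX_OFED_LINUX-*/

    # run the bundled installer, then restart the InfiniBand stack
    sudo ./mlnxofedinstall
    sudo /etc/init.d/openibd restart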

From that machine, run:

  • ibnetdiscover - make sure everything is discoverable, at 4X width and DDR speed

  • ibdiagnet - see if there are any errors in the PM (Performance Monitoring) section (example invocations below).
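
As a rough sketch, assuming the standard infiniband-diags / ibutils tools that come with MOFED are installed, those checks could look like this:

    # walk the fabric and list every node and link; each link should come up as 4X and DDR
    ibnetdiscover

    # per-port summary of link state, width and speed (handy for spotting links that trained down)
    iblinkinfo

    # full fabric diagnostics; check the PM (port counters) section of the report for errors
    ibdiagnet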

Next, look into tuning the servers’ side. Here is a good document:

http://www.mellanox.com/related-docs/prod_software/Performance_Tuning_Guide_for_Mellanox_Network_Adapters.pdf

Next would be to benchmark your network:

  • run basic RDMA BW tests: point A to B, A to C, and A to D - they should all return similar numbers (see the sketch after this list)

  • run your application on each server individually - each server should show a similar result. If not, a slow node can drag down the overall result of the pack.
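
For the bandwidth tests, here is a minimal sketch using ib_write_bw from the perftest package that ships with MOFED (shown Linux-to-Linux; the idea is the same from Windows with whatever bandwidth test tool your driver package provides). The address 192.168.10.2 is only a placeholder for node B’s IPoIB address:

    # on node B: start the server side of the RDMA write bandwidth test
    ib_write_bw

    # on node A: run the client side against node B (placeholder address)
    ib_write_bw 192.168.10.2

    # repeat A->C and A->D; on a healthy DDR fabric all runs should report similar
    # bandwidth, well above the ~1000 MB/s you are currently seeing over IPoIB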

Look at other performance-related posts on this site; you may find more tips.

Good luck!