Using ConnectX-2 VPI adapters to network a workstation with 2 nodes.

I am in the process of building a small rack mount setup for CG animation/rendering using Supermicro barebone servers for LGA 2011 Intel Xeon E5 chips. It will consist of one workstation and two render nodes. I decided to go with Infiniband because it is fast and relatively inexpensive to deploy; I was able to purchase a new dual port Mellanox MHQH29C-XTR and two single port MHQH19B-XTR cards for $600 on eBay. My plan is to use the dual port adapter in the workstation and the single port adapters in each of the nodes, with each node plugged directly into a port on the adapter in the workstation. I have used the same kind of setup for years with regular ethernet NICs without any problems, but I just want to make sure that it will work as easily with Infiniband. Am I losing any of Infiniband's benefits when I don't use a switch? I would appreciate any input as I am very new to the world of Infiniband. Thank you!

Yes, you can plug a cable between two ports directly, but you need to start the Subnet Manager on one machine (/etc/init.d/opensmd; it can be on any machine).
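
If it helps, the bare-minimum sequence on a CentOS-style box looks something like the following. The init script is named opensmd or opensm depending on the package, and sminfo/ibstat come with the standard Infiniband diagnostics; treat this as a sketch rather than gospel:

$ sudo /etc/init.d/opensmd start   # start the subnet manager on one machine
$ sminfo                           # should report the master SM's LID and GUID
$ ibstat                           # ports should move from Initializing to Active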

What you lose is scalability, of course. But you get slightly better latency without a switch:)

I was already looking into Puppet Enterprise because it was free for up to 10 nodes. Never heard of Chef but I will give it a look. Thanks again!

EDIT: StH forums? I don’t remember ever posting anything in that forum.

I have decided to make some huge changes to my setup. I found a very good deal on a bunch of Intel Xeon X5680 OEM CPUs and IBM Mellanox ConnectX-2 VPI single port cards on eBay, so I decided to build my render nodes around those instead of the LGA 2011 chips. The cost savings are good enough that I've decided to up my number of render nodes to 10, since my V-Ray license allows me to render across that many computers. I will still be keeping my rackmount workstation as an LGA 2011 Xeon setup. Since my setup will now be 1 workstation and 10 nodes I will have to get a switch, so I was looking at the IS5023. Should be a very fun and interesting undertaking!

Thank you very much for the info Justin, much appreciated!

Hi animGuy, it appears we have some things in common; Renderfarmer's DIY Render Farm | ServeTheHome Forums

I’m on Vray for Maya and I started out with two DP X56XX nodes 2 years ago.

A few things I wish someone had pointed out to me when I started out:

  1. Once you get 10, you’ll want 20; plan accordingly.
  2. You’ll want a dedicated file server/license manager/render manager for your setup.
  3. X5680’s run very hot. Ensure adequate ambient cooling and mind your electric bill.
  4. Make sure you have adequate power adjacent to your rack for 10+ of those machines.
  5. IPoIB is awesome but it’s hard to bridge with a home internet connection (one workaround is sketched just after this list).
  6. IB switches are LOUD.
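
On point 5: since bridging IPoIB to an Ethernet home connection is a pain, what usually works instead is plain routing/NAT on the head node, with one leg on the home LAN and one on the fabric. A rough sketch, assuming eth0 faces the internet and ib0 faces the render nodes (interface names and addresses are just examples):

$ sudo sysctl -w net.ipv4.ip_forward=1
$ sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
$ sudo iptables -A FORWARD -i ib0 -o eth0 -j ACCEPT
$ sudo iptables -A FORWARD -i eth0 -o ib0 -m state --state RELATED,ESTABLISHED -j ACCEPT

The render nodes then just use the head node's ib0 address as their default gateway.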

Shoot me a PM if you have any questions or have some advice of your own to share.

siszlai - Generally that’s good advice (i.e. “you need to run opensm”), but for the specific topology mentioned here it’s not quite right.

A 3 node setup, where two nodes have only a single port and the middle box has two ports, doesn’t work quite like that.

OpenSM only binds to one port on a server (by default the first port on the first card), then explores/discovers the network topology through just that connection. So, if it gets started on the middle box, it won’t see the 2nd port or the other server connected through it. (This is specific to this topology, and doesn’t happen with a switch.)

The workaround is super easy, but counter-intuitive. Just run OpenSM on both of the nodes with the single port cards, and don’t run it on the middle box.
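
For completeness, and only as an aside: if you ever do want the SM on the dual-port box (say, while recabling), opensm can be pinned to a specific port with its -g option, one instance per port. The GUID below is a placeholder; ibstat -p lists the real ones. This is just a sketch of that alternative, not what I’d recommend for your setup:

$ ibstat -p                              # list the port GUIDs on the local HCA
$ sudo opensm -B -g 0x0002c903000xxxxx   # run a daemonized SM bound to that port GUID

Run a second instance with the other port’s GUID (ideally with separate log files) and both links get managed. Running it on the two render nodes is still the simpler answer.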

I do similar to this with some test boxes here, and run IPoIB over the top of it. Works fine that way.

(note - edited slightly for clarity)
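
In case it’s useful, the IPoIB part is just a normal interface on CentOS. A minimal sketch of /etc/sysconfig/network-scripts/ifcfg-ib0 on each box (the addresses are made up, and CONNECTED_MODE plus the large MTU are optional tuning):

DEVICE=ib0
TYPE=InfiniBand
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.10.1
NETMASK=255.255.255.0
CONNECTED_MODE=yes
MTU=65520

Then ifup ib0 brings it up like any other NIC. On the dual-port workstation each port shows up as its own ibN interface, so give each one its own subnet.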

That sounds cool, and it’s no stress at all.

Sounds like interesting fun.

Btw, saw some of your posts on StH forums.

Have you considered using centralised management tools for your nodes (something like Puppet or Chef), instead of manually hand configuring everything via ssh?

That kind of thing is what Puppet and Chef are designed to handle (both competing projects in the same space). For your setup, you don’t need the “Enterprise” version of them, just the normal Open Source Software project release would be fine.
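
To give a flavour of the open source route, a throwaway Puppet test on CentOS is only a couple of commands (assuming the puppet package is available, e.g. from EPEL; the manifest below is a made-up minimal example, not a recommended layout):

$ sudo yum install puppet
$ cat > ib.pp <<'EOF'
# make sure the Infiniband tooling is present and the SM is running on this node
package { ['opensm', 'mstflint']: ensure => installed }
service { 'opensm': ensure => running, enable => true, require => Package['opensm'] }
EOF
$ sudo puppet apply ib.pp

In a real deployment you’d run a puppet master (or Chef server) and let the ten render nodes pull that config automatically instead of applying it by hand on each box.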

Just saying.

Thanks for the warning. Judging from the posts the IS5023 doesn’t have that problem.

WOW, that is a nice home setup. 10 is going to be good for me for a long while. I don’t even want to think about how much it will cost to power a setup like this for a few days. Aside from my license of V-Ray for Maya, the only other renderer I have that will let me render across that many nodes is Luxology Modo 701 (which only recently added Linux support). I will probably purchase a workstation license of SideFX Houdini FX after I get this up and running, since that also comes with unlimited render nodes and awesome VFX tools.

I would have gone with E5-2670 chips for my render nodes, but I found X5680 chips for about half the price ($740 each, to be exact) of even used E5 chips, and the performance difference between the two did not warrant double the price for the Sandy Bridge chips. For the nodes I am considering using Supermicro 1U Twin Servers to save on space. They have a model that has a Mellanox ConnectX-2 40Gbps NIC built into the motherboard of both nodes, which still leaves the PCI-E 2.0 x16 slot open for an LSI RAID card that can connect to the backplane of each node. Supermicro | Products | SuperServers | 1U | 1026TT-IBQF

I was going to go with a bare-bones 3U Supermicro GPU Server to act as my workstation/dedicated file server/license manager/render manager. Supermicro | Products | SuperServers | 3U | 6037R-72RFT+

What I like most about this server is that it supports up to two high-end Kepler GPUs, can have its own battery backup (though it doesn’t seem as adequate as a dedicated UPS), has an on-board LSI 2208 HW RAID controller that can support up to 16 devices, and has two 10GBase-T NICs. The two 5.25" drive bays could easily be used to add 2.5" HD enclosures.

As far as rack mounted UPSs and PSUs go, what would you recommend for a setup like this? I have worked with rack mounted SATCOM and computer networking equipment for almost 10 yrs now (currently employed as a SATCOM SNAP Support Engineer in Afghanistan), but I have never built a setup of my own from the ground up, so any input would be greatly appreciated.

Thank you very much renderfarmer for the pointers. I will be sure to PM questions in the near future!

Out of curiosity, which operating system(s) are you wanting to use for the servers and workstation?

Asking because it can make a difference, depending on what you want to do.

For example, if you’re a Linux guy, the Infiniband drivers are better supported (more mature) in RHEL/CentOS than Ubuntu. If you’re a Windows guy, the protocols available are different in Win 2K8 vs Win 2K12.

So, better to know ahead of time, etc.

Thanks.

Those X5680s will do just fine. The X56XX chips are only 10% slower per clock in Vray than the E5. That’s based on some extensive comparative testing that I did personally.

A couple of things stand out about your proposed config:

  1. Your render nodes don’t need RAID controllers.
  2. The 10GBase-T NICs in your 3U workstation will be a serious bottleneck for your 40Gbps render nodes.

I’m very happy with my 3000VA CyberPower UPS. It was quite reasonably priced. The part number is in that StH link I posted for you.

So far you’re looking at well over $30k for the render nodes and networking gear alone - taking the CPU price and IS5023 you quoted into account. Take great care in planning before you proceed.

I had already decided on going with CentOS Linux for this setup, but I am glad to hear that it has very mature Infiniband drivers. All my CG and compositing programs support and work best with RHEL/CentOS.

It’s pretty straight forward to get working.

Software-wise, CentOS comes with Infiniband drivers and utils that you install via yum:

$ sudo yum groupinstall "Infiniband Support"

When getting cards from eBay, the firmware on them is often a bit old. Easily fixable with the Mellanox firmware burning tool (flint). Install that via yum:

$ sudo yum install mstflint

For your 3 node setup (no switch), you should install the Open Subnet Manager (OpenSM) package on both of your single port nodes (your render boxes):

$ sudo yum install opensm

$ sudo chkconfig opensm on

$ sudo service opensm start
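
Once opensm is running on both render nodes, a quick sanity check from any box (these diagnostics come in with the "Infiniband Support" group; the GUID below is a placeholder for the value ibstat prints on the far end):

$ ibstat                            # each connected port should show State: Active and an SM LID
$ ibhosts                           # lists the HCAs visible on that link
$ ibping -S                         # on one end, start the ibping responder ...
$ ibping -G 0x0002c903000xxxxx      # ... then ping its port GUID from the other end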

Updating the firmware on your cards, using flint, is also pretty straightforward but is better covered another day.
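
Just as a teaser, checking what firmware a card currently has is a one-liner with mstflint (the PCI address below is an example; lspci tells you the real one):

$ lspci | grep -i mellanox
$ sudo mstflint -d 03:00.0 query    # prints the current FW version, PSID and GUIDs

The burn step itself needs an image matching your card's PSID, which is the part worth reading up on first.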

Note, if you ever want to remove the Infiniband stuff from your CentOS boxes, it’s pretty easy:

$ sudo yum groupremove "Infiniband Support"

$ sudo yum remove mstflint opensm

Hope that’s all helpful.

(note - edited for typo fixes)

Is all this stuff going into a proper data center, or are you going to have it at home / office?

Asking because the noise level for some of that stuff might be at the “can’t think straight” level if anyone’s near it.

Heh, went looking back at the StH forums:

http://forums.servethehome.com/networking/

… and you’re right. I was thinking of another guy (renderfarmer) who seems to be in the same field.

Sorry.

renderfarmer wrote:

6. IB switches are LOUD.

Found that out too the first time around, so have been looking for quiet ones instead. This is very helpful:

Suggestions for quiet Infiniband switch? Infrastructure & Networking - NVIDIA Developer Forums

It is going into a spare room I have in the house near the garage. I was actually planning on purchasing a noise dampening enclosure.

  1. So SATA2 won’t be a bottleneck? I thought that it would be. I will remember that.

  2. About the 10GBase-T NIC: I was only going to use it to network to my 2 HP workstations running Windows via Samba. It’s not going to be utilized for rendering purposes.

Right now I am just ordering the CPUs for the time being. I won’t be back home from Afghanistan until I take leave at the end of July. Around that time is when I will be looking for the rest of the parts to build my setup.

Thanks again!

I’ve got a question about the SX6018 switch, and any input would be greatly appreciated. Will the FDR ports auto-negotiate down to QDR and work with ConnectX-2 adapters? Thanks!