How to use two SX6536 switches to build a full-line-speed, congestion-free network?

Hello guys,

We are building a cluster with 910 compute nodes and two storage systems, so we are going to order two SX6536 switches, each with 34 18-port FDR leaf modules. Together the two switches will provide 2 × 612 = 1224 ports. My first question: if we use a fat-tree topology for the cluster, can we get a network that is congestion-free and runs at full line speed? And which fabric topology is best for achieving a congestion-free network?

I asked the Mellanox technicians in China, but they did not seem familiar with this. Some people said that, according to some modelling results, we would have to add edge switches to get a congestion-free network. I’m very confused about this!

As for the modelling, I have read the paper “Infiniband Congestion Control: Modelling and validation”, and I am going to model the congestion problem myself. I downloaded the OMNet++ Infiniband Flit Level Simulation Model from http://www.mellanox.com/page/omnet; however, I could not find the “ccmgr” module in that model, and the Mellanox technicians in China do not know of it either. So my second question: what modelling software is normally used to study InfiniBand network congestion? And if I use the OMNet++ Infiniband Flit Level Simulation Model, where can I find the “ccmgr” module (or IB CC extension)?

Your help would be greatly appreciated!

To add to this, also check:

Understanding Up/Down InfiniBand Routing Algorithm https://community.mellanox.com/s/article/understanding-up-down-infiniband-routing-algorithm

InfiniBand, Gateway and Long Haul Solutions https://community.mellanox.com/s/article/infiniband--gateway-and-long-haul-solutions

Ophir.

Hi Y Sheng,

Below are 2 links that will help you with your questions:

Designing an HPC Cluster with Mellanox InfiniBand Solutions https://community.mellanox.com/s/article/designing-an-hpc-cluster-with-mellanox-infiniband-solutions

Here you can find several examples of how to design a cluster.

InfiniBand Topology Generator | NVIDIA

Here you will find the Mellanox InfiniBand Configurator, an online tool for configuring clusters based on a fat-tree topology with two levels of switch systems.
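For a rough feel of what a two-level fat-tree sizing involves, here is a minimal port-counting sketch in Python. It is not the configurator's actual algorithm. It assumes hypothetical 36-port edge (leaf) switches with half of their ports facing nodes and half facing the spines, and it treats each director switch from the question as a single 612-port spine. The function name and parameters are made up for illustration, and the 910 end ports do not include the storage systems, whose port count was not given. Passing this check only means the port budget allows a 1:1 (non-oversubscribed) layout. Whether the fabric is actually congestion-free also depends on routing and traffic patterns.

```python
import math


def size_two_level_fat_tree(end_ports, leaf_radix, spine_radix, spines):
    """Port-count a 1:1 two-level fat tree (illustrative sketch only).

    end_ports   -- number of node-facing ports required
    leaf_radix  -- ports per leaf (edge) switch; assumed split half down, half up
    spine_radix -- ports per spine switch
    spines      -- number of spine switches; every leaf links to every spine
    """
    down = leaf_radix // 2        # leaf ports facing nodes
    up = leaf_radix - down        # leaf ports facing spines
    if up % spines != 0:
        raise ValueError("uplinks per leaf must divide evenly across the spines")

    links_per_leaf_spine = up // spines
    leaves = math.ceil(end_ports / down)
    ports_per_spine = leaves * links_per_leaf_spine

    return {
        "leaves": leaves,
        "links_per_leaf_spine": links_per_leaf_spine,
        "ports_per_spine": ports_per_spine,
        "fits": ports_per_spine <= spine_radix,
    }


if __name__ == "__main__":
    # Numbers from the question: 910 nodes, two directors with 612 ports each.
    # The 36-port leaf switch is an assumption, not something stated in the thread.
    print(size_two_level_fat_tree(end_ports=910, leaf_radix=36,
                                  spine_radix=612, spines=2))
```

Under these assumptions the port budget fits when the two directors act as spines behind a layer of edge switches. Add the storage ports to `end_ports` before drawing any conclusion, and use the configurator for the real design.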

Marlon