Feasibility of Switchless InfiniBand Network for Small Cluster with Multi-Host Cards

Hi,

If I recall correctly, InfiniBand is normally deployed with switch-based connectivity. However, I would like to ask whether a switchless InfiniBand network built on your multi-host ConnectX adapters (such as the dual-port ConnectX-6/7 or similar models) is feasible and officially supported.

I have seen people in this forum talking about a three-node switchless configuration using dual ports, and running three SM daemons.

Does NVIDIA officially support and validate such a switchless InfiniBand configuration using multi-port adapters in a multi-host configuration?

A potential scenario could be a 12-node cluster connected via three dual-port ConnectX-6/7 cards, where:

  • each card connects its two ports to the other two dual-port ConnectX-6/7 cards, and
  • each card also connects internally to another three hosts using a multi-host configuration.
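
To make the layout above concrete, here is a minimal Python sketch that just counts hosts, cables, and ports for this scenario. It assumes each card serves four hosts in total (one primary host plus the three additional ones) and that the three cards are cabled as a triangle. The names `build_topology` and `ports_used` are made up for illustration and say nothing about how the adapters or the subnet manager are actually configured.

```python
from itertools import combinations

CARDS = 3            # dual-port multi-host adapters in the scenario
HOSTS_PER_CARD = 4   # assumption: 1 primary host + 3 additional hosts per card
PORTS_PER_CARD = 2   # dual-port cards


def build_topology(cards=CARDS, hosts_per_card=HOSTS_PER_CARD):
    """Return (hosts per card, card-to-card cables) for a full mesh of cards."""
    hosts = {
        c: [f"card{c}-host{h}" for h in range(hosts_per_card)]
        for c in range(cards)
    }
    # One cable between every pair of cards (a triangle when cards == 3).
    links = list(combinations(range(cards), 2))
    return hosts, links


def ports_used(links, cards=CARDS):
    """Count how many external ports each card needs for the given cables."""
    used = {c: 0 for c in range(cards)}
    for a, b in links:
        used[a] += 1
        used[b] += 1
    return used


hosts, links = build_topology()
total_hosts = sum(len(h) for h in hosts.values())
print(total_hosts)        # number of hosts in the cluster
print(links)              # card-to-card cables
print(ports_used(links))  # external ports needed per card
```

With three cards, the full mesh uses exactly the two ports available on each card. A full mesh of n cards needs n - 1 ports per card, so with more than three dual-port cards the cabling would have to be a ring or a chain instead.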

Thank you!

Hi,

It sounds feasible. However, do you have devices with 4-host multi-host support? From what I recall, these are special NICs designed by partners for their servers.

Regardless, if you want to explore a switchless Ethernet solution, you can check our host-chaining capability. Note, though, that for 12 nodes the actual bandwidth you get will be X Gbps/12.
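
As a rough illustration of that estimate, the sketch below divides a link rate evenly across the nodes in the chain. The 100 Gbps figure is only a placeholder for the unspecified X, and `per_node_bandwidth` is a hypothetical helper, not part of any NVIDIA tool.

```python
def per_node_bandwidth(link_gbps: float, nodes: int) -> float:
    """Even split of one link's rate across all nodes in the chain."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return link_gbps / nodes


# Placeholder link rate; substitute the real port speed for X.
ASSUMED_LINK_GBPS = 100.0
share = per_node_bandwidth(ASSUMED_LINK_GBPS, 12)
print(f"{share:.2f} Gbps per node")
```

This is only the simple division from the estimate above. Real throughput will depend on the traffic pattern and on where each node sits in the chain.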

Check out these release notes (look for “Host Chaining”):

If you need further assistance with either, I suggest opening a support ticket.

Regards,

Yaniv

Hi Yaniv,

Thank you for your quick response.

I’m evaluating this configuration with a partner who has certified hardware with multi-node support.

I have attached a quick diagram to illustrate a potential setup with 8 multi-host cards, each connecting internally to three other nodes. The PoC will have three multi-host cards and a total of 12 nodes.

Do you have any documentation about this particular topic?
The ConnectX-5 documentation is quite comprehensive, but it does not include a reference to this specific topic.

Any additional information will be appreciated.

Thank you.

Regards.
