If I recall correctly, InfiniBand only supports switch-based connectivity. However, I would like to inquire about the official support and feasibility of configuring a switchless InfiniBand network using your multi-host ConnectX adapters (such as the ConnectX-6/7 Dual-Port or similar models).
I have seen people in this forum talking about a three-node switchless configuration using dual ports, and running three SM daemons.
Does NVIDIA officially support and validate such a switchless InfiniBand configuration using multi-port adapters in a multi-host configuration?
A potential scenario would be a 12-node cluster built from three dual-port ConnectX-6/7 adapters, where:
each card connects its two ports to the other two dual-port ConnectX-6/7 cards, and
each card also connects internally to another three hosts through a multi-host configuration.
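To make the scenario above concrete, here is a minimal Python sketch that models the cabling and counts the nodes. It assumes four hosts per card (12 nodes over three cards) and one cable between every pair of cards. The function name `build_topology` and the `cardN` / `cardN-hostM` labels are hypothetical and used only for illustration. This checks the arithmetic of the layout, not whether the configuration is supported.

```python
from itertools import combinations


def build_topology(num_cards=3, hosts_per_card=4, ports_per_card=2):
    """Model a full mesh of multi-host cards and return (links, hosts).

    All names are illustrative assumptions, not vendor terminology.
    """
    cards = [f"card{i}" for i in range(num_cards)]

    # Full mesh: one cable between every pair of cards.
    links = list(combinations(cards, 2))

    # In a full mesh each card needs one external port per peer card.
    ports_needed = num_cards - 1
    if ports_needed > ports_per_card:
        raise ValueError(
            f"full mesh needs {ports_needed} ports per card, "
            f"but only {ports_per_card} are available"
        )

    # Hosts that share each card through its multi-host interface.
    hosts = {
        card: [f"{card}-host{j}" for j in range(hosts_per_card)]
        for card in cards
    }
    return links, hosts


links, hosts = build_topology()
total_nodes = sum(len(h) for h in hosts.values())
print(f"card-to-card links: {len(links)}, total nodes: {total_nodes}")
```

With the defaults this gives three card-to-card links and 12 nodes, so both ports on every card are used. With more than three dual-port cards the function raises an error, because a full mesh would need more than two ports per card.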
It sounds feasible. However, do you have devices with 4-host multi-host support? From what I recall, these are special NICs designed by partners for their servers.
Regardless, if you want to explore a switchless Ethernet solution, you can check our host-chaining capability. Note, though, that with 12 nodes the effective bandwidth drops to roughly X Gbps / 12, where X is the link speed.
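As a rough illustration of that bandwidth figure, the short Python sketch below applies the "link speed divided by node count" rule of thumb from this reply. The helper name `effective_bandwidth_gbps` and the 100 and 200 Gbps link speeds are assumptions chosen as examples, not measured or documented values.

```python
def effective_bandwidth_gbps(link_rate_gbps, num_nodes):
    """Rule-of-thumb estimate: the link rate shared evenly by all nodes."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return link_rate_gbps / num_nodes


# Example link speeds (assumed values, in Gbps) for a 12-node chain.
for rate in (100, 200):
    share = effective_bandwidth_gbps(rate, 12)
    print(f"{rate} Gbps link, 12 nodes: about {share:.2f} Gbps effective")
```

Actual throughput depends on the traffic pattern, so treat this as an estimate only.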
Check out these release notes (look for “Host Chaining”):
If you need further assistance with either, I suggest opening a support ticket.
I’m evaluating this configuration with a partner who has certified hardware with multi-node support.
I have attached a quick diagram to illustrate a potential setup with 8 multi-host cards, each connecting internally to three other nodes. The PoC will have three multi-host cards and a total of 12 nodes.
Do you have any documentation about this particular topic?
The ConnectX-5 documentation is quite comprehensive, but it does not include a reference to this specific topic.