Connecting Multiple DGX Spark Units: Ethernet Switch Recommendations

Has anyone thought about scaling a DGX Spark cluster by connecting more than 2 DGX Spark units through an Ethernet switch? If so, which switch would you recommend, and why?

I appreciate that the 200GbE ConnectX-7 link is recommended for connecting 2 units, but if you scale beyond 2 to create a mini scalable supercluster, I would imagine a switch would be recommended over daisy-chaining the DGX Sparks with ConnectX-7 cables?

Appreciate your thoughts!

Hello Maiia,

Welcome to the NVIDIA forum.

That is a sharp question. We don’t have an engineering sample of the Spark either, so I don’t have more information than you do.

  1. Only one CX7 port is used to connect to another Spark. No information explains why; perhaps it is a power limitation?
  2. All current information indicates that the Spark supports only one back-to-back connection, using a 40 cm DAC.
  3. Given point 2, I don’t know whether the Spark can be connected to a switch, which would go beyond what is officially promised.
  4. In the CX7 era, data center builds use the SN4600 switch to connect nodes, but I think any switch with 200G Ethernet ports should be fine. There is no information on whether the Spark’s CX7 port supports InfiniBand (IB); if it does, the QM97xx is the current-generation switch for CX7 cards in IB mode.
  5. I don’t think it’s worth trying. In your other thread, you already received a valuable view on the Spark’s scalability limitations.

Regards

Haitao Sun


Thanks for replying, @haitaos

I do see the two ConnectX-7 ports for connecting 2 DGX Spark units together.

As a nomadic solopreneur, I find the thought of a portable cluster appealing for IP privacy compared with the cloud.

That’s why I’m researching potential scale beyond two DGX Sparks. I came across the Mellanox Spectrum-2 and Spectrum-3 units and wondered whether the ports and data rates need to be 200GbE, and whether there are any other technical considerations.

I’m working on my personal 2026 forecast, and since these switches are a sizeable investment even for a 16-port unit, it’s important to choose the right tech when the time comes to scale, if I choose to scale.
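
A rough port-count and data-rate check for that decision can be sketched in Python. This is only an illustration under stated assumptions, not a validated sizing tool: it assumes each Spark uses a single switch-facing port (untested on real units), that a link runs at the lower of the NIC rate and the switch port rate, and the `SwitchSpec` type, the `check_fit` helper, and the example switch label are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class SwitchSpec:
    """Hypothetical description of a candidate switch (not a real product entry)."""
    name: str
    ports: int       # number of usable ports
    port_gbps: int   # per-port line rate in Gb/s


def check_fit(switch: SwitchSpec, nodes: int, nic_gbps: int = 200, uplinks: int = 0):
    """Return (fits, per_node_gbps, spare_ports) for a planned cluster.

    Assumptions: one switch port per node, and the link settles at the
    lower of the NIC rate and the switch port rate.
    """
    needed = nodes + uplinks
    fits = needed <= switch.ports
    per_node_gbps = min(nic_gbps, switch.port_gbps)
    spare_ports = switch.ports - needed
    return fits, per_node_gbps, spare_ports


if __name__ == "__main__":
    # Hypothetical 16-port 200GbE switch with four Spark units attached.
    candidate = SwitchSpec(name="example-16x200G", ports=16, port_gbps=200)
    fits, rate, spare = check_fit(candidate, nodes=4)
    print(f"{candidate.name}: fits={fits}, per-node rate={rate} Gb/s, spare ports={spare}")
```

Under these assumptions, a switch with slower ports would still connect the nodes but would cap each link below the 200 Gb/s of the ConnectX-7, which is the trade-off to weigh against the cost of the switch.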

I do hear you: we don’t have the units yet, so this might be best revisited later this year, once the Founder units have been physically tested.

Adding feedback from the parallel thread to close the loop.
