Suggested cable to link two Sparks?

Could anyone suggest an appropriate cable to link two Sparks? They will sit on top of each other, so I’d prefer a short cable.

FS.com sells a 0.5m (2ft) NVIDIA/Mellanox MCP1650-V00AE30 that’s on the list of approved devices from NVIDIA.

Ref:

Which Cables to be used for ConnectX-7? - #2 by MvB

While waiting for my Mellanox cable, I found that you can connect two Sparks over plain Ethernet and NCCL works just fine. It’s slow, of course, but it’s fast enough to prototype with a relatively small model.
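For anyone trying the Ethernet route, here is a minimal sketch of how the environment might be set up before launching a job. The post above does not say which settings were used, so this is an assumption: it relies on the standard NCCL variables `NCCL_SOCKET_IFNAME`, `NCCL_IB_DISABLE` and `NCCL_DEBUG`, and the interface name `eth0` is a hypothetical placeholder (check yours with `ip link`).

```python
import os


def nccl_ethernet_env(ifname, base=None):
    """Return a copy of the environment that steers NCCL onto a plain
    Ethernet interface using its TCP socket transport.

    `ifname` is whatever your wired interface is called; "eth0" in the
    usage below is only a placeholder, not the Spark's real name.
    """
    env = dict(os.environ if base is None else base)
    env["NCCL_SOCKET_IFNAME"] = ifname  # interface NCCL should use for sockets
    env["NCCL_IB_DISABLE"] = "1"        # skip the InfiniBand/RoCE transport
    env["NCCL_DEBUG"] = "INFO"          # log which transport NCCL actually picks
    return env


if __name__ == "__main__":
    env = nccl_ethernet_env("eth0")  # hypothetical interface name
    for key in ("NCCL_SOCKET_IFNAME", "NCCL_IB_DISABLE", "NCCL_DEBUG"):
        print(f"{key}={env[key]}")
```

The returned dictionary can be passed as `env=` to `subprocess.run` when starting the launcher on each Spark. The `NCCL_DEBUG=INFO` output is a quick way to confirm that the socket transport is really in use.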

PNY DGX Spark Stacking Cable

Mfr Part# NJAAKKR-0006 (30 AWG)

Mfr Part# NJAAKK-0006 (32 AWG)

You can find them at Micro Center for $99.00. The one I bought at Micro Center was made by Amphenol:

https://www.amphenol-cs.com/product/njaakr0006.html

https://www.amphenol-cs.com/product/njaakk0006.html

Other than the wire gauge they have the same specs - at 0.5m either should be fine.

There are two approved cable SKUs

Warning: this is not a good reference. You cannot use any random cable from this list. I bought the MCP1650-V00AE30 and it does not work.

I just ordered the NVIDIA/Mellanox MCP1650-V00AE30 Compatible 0.5m (1.6ft) 200G QSFP56 Passive Direct Attach Copper Twinax Cable 30AWG for DGX Spark Dual-System Interconnect from NADDOD. It is advertised as being compatible and they tell me they test each cable before shipping. I will let you know if it works.

I attended NVIDIA GTC in Washington last week and nobody, I mean nobody, had these cables for sale at the show, although a Dell guy tried to see if he could sell me one of their extras (but no). An NVIDIA guy I talked to said they had underestimated the demand and had only a single source.

There is a wide range of prices, too. The one I list above is $66 but shipping is $25 and I think it’s coming from China. Perhaps I will get a customs bill as well. But I ordered it yesterday and just got a notice from FedEx.

Here’s the link: Naddod cable

The cable arrived yesterday and it passed the nccl-test suite. Speed is 20 - 22 GB (not Gb) per second which is about the most we can expect. So far, seems like a good buy.

Since the cables are out of stock at Micro Center and Amazon as well as NADDOD, I looked around and found availability at FX for $88 with free shipping from the US, arriving next week. They also ship internationally via FedEx.

Also ran into this one from Complete Connect for UK/Europe-based folks, at £88:

https://www.completeconnect.co.uk/product/400g-dac-qsfp112-cables-for-nvidia-passive/

I wonder if any of these alternative cables are capable of achieving 200G on Sparks? The “official” cable is 112G/lane so that would give up to 400Gb/s.
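As a rough sketch of the lane arithmetic behind that question: both QSFP56 and QSFP112 cables carry 4 electrical lanes, and "112G" is the signaling class for a nominal 100 Gb/s per lane, versus 50 Gb/s per lane for QSFP56. The table below is filled in from those nominal figures, not from any Spark-specific documentation, and it only describes what the cable is rated for, not what the Spark's ports will actually negotiate.

```python
# Nominal per-lane data rates in Gb/s (assumption: standard 4-lane
# QSFP56 / QSFP112 form factors, PAM4 signaling).
NOMINAL_LANE_GBPS = {
    "QSFP56": 50,    # 4 x 50G  -> 200G class
    "QSFP112": 100,  # 4 x 100G -> 400G class ("112G" is the raw lane signaling rate)
}
LANES_PER_CABLE = 4


def cable_rating_gbps(form_factor):
    """Nominal aggregate rating of one cable, in Gb/s."""
    return LANES_PER_CABLE * NOMINAL_LANE_GBPS[form_factor]


if __name__ == "__main__":
    for form_factor in NOMINAL_LANE_GBPS:
        print(f"{form_factor}: {cable_rating_gbps(form_factor)} Gb/s nominal")
```

So on paper a QSFP56 cable is a 200G part and a QSFP112 cable is a 400G part; whether the cheaper one reaches 200G on a Spark still needs a real benchmark.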

Can anyone aggregate virtual ports and test throughput? I was going to test it myself, but since our Microcenter ran out of officially supported cables, I postponed my purchase until someone can confirm that 200G is achievable (or at least closer to it).

NADDOD is selling, or more like scalping, Sparks at that price, so I assume the cable is the same one from the NVIDIA bundle. FS had the same cable specs.

I thought we couldn’t push to 400G aggregate, and that we maxed out at 200G.

I saw 400G mentioned and bookmarked a post about merging ports so that one cable would work. I will test the port merge tomorrow and the second cable between the two Sparks next week.

Wow, Spark is hard to justify at $4K, no way I would buy at that price :)

The cable is still $95 with 10-day shipping, but there’s not enough desperation for anyone to pay that DGX Spark price markup, especially with partners selling units at discount rates with 0% APR payment offers.

Well, a direct equivalent to the “official” stacking cable seems to be this one: NVIDIA/Mellanox Compatible 0.5m (1.6ft) 400G QSFP112 Passive Direct Attach Cable - NADDOD

@eugr @PrinceHal got that cable and said it worked. It might be worth testing whether it can carry 400G or not.

Thanks for sharing the direct equivalent!

@PrinceHal how’s the cable working for you so far?

Found another one on FS.com - @NVES - could you please confirm whether this cable’s specs match the official one, as the official one is pretty much unobtanium now? https://www.fs.com/products/149312.html?attribute=26053&id=3713425

EDIT: no, this one is not it. It’s QSFP-DD, not QSFP112, so it won’t be compatible.

Still, a question to @NVES and other NVIDIA folks: besides those two “official” SKUs, what exact specs are needed to achieve 200Gbps on a single port? Is QSFP56 enough? Or does it have to be QSFP112, like the “official” ones? I would assume that QSFP56 ones would work, but wanted to double check, as Spark is not exactly a standard ConnectX-7 configuration…

The NADDOD cable seems to work fine. Now I have to learn how to use it, as in running vLLM with larger models across the two linked Sparks, or fine-tuning or quantizing a larger model with the 256GB of combined RAM.

Have you had a chance to aggregate the two “virtual” ports and achieve 200 Gbps between two Sparks in benchmarks?

I have two Sparks connected by one cable. I followed Nvidia’s directions in Spark Clustering and Stacked Sparks. Running the nccl-test script from the Stacked Sparks link I obtained 20 - 22 GB / sec transfers. Take 8 bits in a byte and allow for other overhead and that’s within reasonable expectations for a 200Gbit link. I hear it might be possible to bind two links to double the speed but would need another cable to try. When I hear for sure that people have done it I might well order another cable. I hope that answers your question (I use “bind” where you use “aggregate virtual ports” perhaps).
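The bytes-to-bits arithmetic in that post can be checked with a few lines. This is only a unit conversion of the reported 20 to 22 GB/s against a nominal 200 Gb/s link. It assumes decimal units (1 GB = 8 Gb) and does not model protocol overhead.

```python
LINK_GBITS_PER_S = 200  # nominal rating of the link discussed above


def gbytes_to_gbits(gbytes_per_s):
    """Convert GB/s to Gb/s (8 bits per byte, decimal units)."""
    return gbytes_per_s * 8


def link_utilization(gbytes_per_s, link_gbits=LINK_GBITS_PER_S):
    """Fraction of the nominal link rate that a measured GB/s figure represents."""
    return gbytes_to_gbits(gbytes_per_s) / link_gbits


if __name__ == "__main__":
    for measured in (20, 22):  # the range reported from the nccl test run
        print(
            f"{measured} GB/s = {gbytes_to_gbits(measured)} Gb/s, "
            f"{link_utilization(measured):.0%} of {LINK_GBITS_PER_S} Gb/s"
        )
```

That puts the measurements at 160 to 176 Gb/s, roughly 80 to 88 percent of the nominal rate, which fits the "within reasonable expectations" reading once overhead is allowed for.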
